
COMPOUNDS AND METHODS FOR IMMUNOTHERAPY 
AND DIAGNOSIS OF TUBERCULOSIS 

TECHNICAL FIELD

The present invention relates generally to detecting, treating and 
preventing Mycobacterium tuberculosis infection. The invention is more particularly 
related to polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion 
or other variant thereof, and the use of such polypeptides for diagnosing and vaccinating 
against Mycobacterium tuberculosis infection.

BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing
countries, as well as an increasing problem in developed areas of the world, with about
8 million new cases and 3 million deaths each year. Although the infection may be
asymptomatic for a considerable period of time, the disease is most commonly
manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive
cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended
antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition,
although compliance with the treatment regimen is critical, patient behavior is difficult
to monitor. Some patients do not complete the course of treatment, which can lead to
ineffective treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis requires effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the
most efficient method for inducing protective immunity. The most common
Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an
avirulent strain of Mycobacterium bovis. However, the safety and efficacy of BCG are a
source of controversy and some countries, such as the United States, do not vaccinate
the general public. Diagnosis is commonly achieved using a skin test, which involves
intradermal exposure to tuberculin PPD (purified protein derivative). Antigen-specific
T cell responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to Mycobacterial antigens. Sensitivity and
specificity have, however, been a problem with this test, and individuals vaccinated
with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by
the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of
CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the
anti-mycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is
less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages
to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann in
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press,
Washington, DC, 1994.

Accordingly, there is a need in the art for improved vaccines and 
methods for preventing, treating and detecting tuberculosis. The present invention 
fulfills these needs and further provides other related advantages.

SUMMARY OF THE INVENTION 

Briefly stated, this invention provides compounds and methods for
preventing and diagnosing tuberculosis. In one aspect, polypeptides are provided
comprising an immunogenic portion of a soluble M. tuberculosis antigen, or a variant of
such an antigen that differs only in conservative substitutions and/or modifications. In
one embodiment of this aspect, the soluble antigen has one of the following N-terminal
sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser; (SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro; (SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
(SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly; (SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn; (SEQ ID No. 128)

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID No. 135) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID No. 136)

wherein Xaa may be any amino acid.

In a related aspect, polypeptides are provided comprising an
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications, the antigen having one
of the following N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)

wherein Xaa may be any amino acid.

In another embodiment, the soluble M. tuberculosis antigen comprises an 
amino acid sequence encoded by a DNA sequence selected from the group consisting of 
the sequences recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the
complements of said sequences, and DNA sequences that hybridize to a sequence 
recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101 or a complement thereof 
under moderately stringent conditions. 

In a related aspect, the polypeptides comprise an immunogenic portion
of an M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications, wherein the antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of
the sequences recited in SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201, the
complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201 or a complement thereof
under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides,
expression vectors comprising these DNA sequences and host cells transformed or
transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, an inventive
polypeptide and a known M. tuberculosis antigen.

Within other aspects, the present invention provides pharmaceutical
compositions that comprise one or more of the above polypeptides, or a DNA molecule
encoding such polypeptides, and a physiologically acceptable carrier. The invention
also provides vaccines comprising one or more of the polypeptides as described above
and a non-specific immune response enhancer, together with vaccines comprising one
or more DNA sequences encoding such polypeptides and a non-specific immune
response enhancer.

In yet another aspect, methods are provided for inducing protective
immunity in a patient, comprising administering to a patient an effective amount of one
or more of the above polypeptides.

In further aspects of this invention, methods and diagnostic kits are
provided for detecting tuberculosis in a patient. The methods comprise contacting
dermal cells of a patient with one or more of the above polypeptides and detecting an
immune response on the patient's skin. The diagnostic kits comprise one or more of the
above polypeptides in combination with an apparatus sufficient to contact the
polypeptide with the dermal cells of a patient.

In yet other aspects, methods are provided for detecting tuberculosis in a
patient, such methods comprising contacting dermal cells of a patient with one or more
polypeptides encoded by a DNA sequence selected from the group consisting of SEQ
ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203, the complements of
said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID
Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203; and detecting an
immune response on the patient's skin. Diagnostic kits for use in such methods are also
provided.

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if
each was incorporated individually. 

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figure 2 illustrates the stimulation of proliferation and interferon-γ
production in T cells derived from an M. tuberculosis-immune individual by the two
representative polypeptides TbRa3 and TbRa9.

Figures 3A-D illustrate the reactivity of antisera raised against secretory
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive
antigens Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2),
M. tuberculosis secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant
TbH-9 (lane 5) and recombinant 85b (lane 5).

Figure 4A illustrates the stimulation of proliferation in a TbH-9-specific
T cell clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control
antigen, TbRa11.

Figure 4B illustrates the stimulation of interferon-γ production in a
TbH-9-specific T cell clone by secretory M. tuberculosis proteins, PPD and recombinant
TbH-9.

Figures 5A and B illustrate the stimulation of proliferation and
interferon-γ production in TbH9-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 6A and B illustrate the stimulation of proliferation and
interferon-γ production in Tb38-1-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 7A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells previously shown to respond to both TbH-9 and
Tb38-1 by the fusion protein TbH9-Tb38-1.

Figures 8A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a first M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.

Figures 9A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a second M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.


SEQ. ID NO. 31 is the DNA sequence of TbH-5. 
SEQ. ID NO. 32 is the DNA sequence of TbH-8. 
SEQ. ID NO. 33 is the DNA sequence of TbH-9. 
SEQ. ID NO. 34 is the DNA sequence of TbM-1. 
SEQ. ID NO. 35 is the DNA sequence of TbM-3. 
SEQ. ID NO. 36 is the DNA sequence of TbM-6. 
SEQ. ID NO. 37 is the DNA sequence of TbM-7. 
SEQ. ID NO. 38 is the DNA sequence of TbM-9. 
SEQ. ID NO. 39 is the DNA sequence of TbM-12. 
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15.
SEQ. ID NO. 43 is the DNA sequence of TbH-4. 
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD. 

SEQ. ID NO. 45 is the DNA sequence of TbH-12. 

SEQ. ID NO. 46 is the DNA sequence of Tb38-1.

SEQ. ID NO. 47 is the DNA sequence of Tb38-4. 

SEQ. ID NO. 48 is the DNA sequence of TbL-17. 

SEQ. ID NO. 49 is the DNA sequence of TbL-20. 

SEQ. ID NO. 50 is the DNA sequence of TbL-21.

SEQ. ID NO. 51 is the DNA sequence of TbH-16. 

SEQ. ID NO. 52 is the DNA sequence of DPEP. 

SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP. 

SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen. 

SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen. 

SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen. 

SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen. 

SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen. 

SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen. 

SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen. 
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-9.

SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-12. 

SEQ. ID NO. 93 is the amino acid sequence of Tb38-1 Peptide 1. 

SEQ. ID NO. 94 is the amino acid sequence of Tb38-1 Peptide 2. 

SEQ. ID NO. 95 is the amino acid sequence of Tb38-1 Peptide 3. 

SEQ. ID NO. 96 is the amino acid sequence of Tb38-1 Peptide 4. 

SEQ. ID NO. 97 is the amino acid sequence of Tb38-1 Peptide 5. 

SEQ. ID NO. 98 is the amino acid sequence of Tb38-1 Peptide 6. 

SEQ. ID NO. 99 is the DNA sequence of DPAS.

SEQ. ID NO. 100 is the deduced amino acid sequence of DPAS.

SEQ. ID NO. 101 is the DNA sequence of DPV. 

SEQ. ID NO. 102 is the deduced amino acid sequence of DPV. 

SEQ. ID NO. 103 is the DNA sequence of ESAT-6. 

SEQ. ID NO. 104 is the deduced amino acid sequence of ESAT-6. 

SEQ. ID NO. 105 is the DNA sequence of TbH-8-2. 

SEQ. ID NO. 106 is the DNA sequence of TbH-9FL. 

SEQ. ID NO. 107 is the deduced amino acid sequence of TbH-9FL. 

SEQ. ID NO. 108 is the DNA sequence of TbH-9-1.

SEQ. ID NO. 109 is the deduced amino acid sequence of TbH-9-1.

SEQ. ID NO. 110 is the DNA sequence of TbH-9-4.

SEQ. ID NO. 111 is the deduced amino acid sequence of TbH-9-4.

SEQ. ID NO. 112 is the DNA sequence of Tb38-1F2 IN.

SEQ. ID NO. 113 is the DNA sequence of Tb38-2F2 RP.

SEQ. ID NO. 114 is the deduced amino acid sequence of Tb37-FL.

SEQ. ID NO. 115 is the deduced amino acid sequence of Tb38-IN.

SEQ. ID NO. 116 is the DNA sequence of Tb38-1F3.

SEQ. ID NO. 117 is the deduced amino acid sequence of Tb38-1F3.

SEQ. ID NO. 118 is the DNA sequence of Tb38-1F5.

SEQ. ID NO. 119 is the DNA sequence of Tb38-1F6.

SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of DPV. 
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 124 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 125 is the deduced N-terminal amino acid sequence of AEES.
SEQ. ID NO. 126 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 127 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 128 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 129 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 130-133 are the protein sequences of four DPPD cyanogen
bromide fragments.



SEQ ID NO. 134 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 135 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 136 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 137 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 138 is the DNA sequence of TbH-29.
SEQ ID NO. 139 is the DNA sequence of TbH-30.
SEQ ID NO. 140 is the DNA sequence of TbH-32.
SEQ ID NO. 141 is the DNA sequence of TbH-33.
SEQ ID NO. 142 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 143 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 144 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 145 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 146-151 are PCR primers used in the preparation of a fusion
protein containing TbRa3, 38 kD and Tb38-1.

SEQ ID NO: 152 is the DNA sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.

SEQ ID NO: 153 is the amino acid sequence of the fusion protein containing
TbRa3, 38 kD and Tb38-1.

SEQ ID NO: 154 is the DNA sequence of the M. tuberculosis antigen 38 kD.
SEQ ID NO: 155 is the amino acid sequence of the M. tuberculosis antigen 38
kD.

SEQ ID NO: 156 is the DNA sequence of XP14.
SEQ ID NO: 157 is the DNA sequence of XP24.
SEQ ID NO: 158 is the DNA sequence of XP31.
SEQ ID NO: 159 is the 5' DNA sequence of XP32.
SEQ ID NO: 160 is the 3' DNA sequence of XP32.
SEQ ID NO: 161 is the predicted amino acid sequence of XP14.
SEQ ID NO: 162 is the predicted amino acid sequence encoded by the reverse
complement of XP14.

SEQ ID NO: 163 is the DNA sequence of XP27.
SEQ ID NO: 164 is the DNA sequence of XP36.
SEQ ID NO: 165 is the 5' DNA sequence of XP4.
SEQ ID NO: 166 is the 5' DNA sequence of XP5.
SEQ ID NO: 167 is the 5' DNA sequence of XP17.
SEQ ID NO: 168 is the 5' DNA sequence of XP30.
SEQ ID NO: 169 is the 5' DNA sequence of XP2.
SEQ ID NO: 170 is the 3' DNA sequence of XP2.
SEQ ID NO: 171 is the 5' DNA sequence of XP3.
SEQ ID NO: 172 is the 3' DNA sequence of XP3.
SEQ ID NO: 173 is the 5' DNA sequence of XP6.
SEQ ID NO: 174 is the 3' DNA sequence of XP6.
SEQ ID NO: 175 is the 5' DNA sequence of XP18.
SEQ ID NO: 176 is the 3' DNA sequence of XP18.
SEQ ID NO: 177 is the 5' DNA sequence of XP19.
SEQ ID NO: 178 is the 3' DNA sequence of XP19.
SEQ ID NO: 179 is the 5' DNA sequence of XP22.
SEQ ID NO: 180 is the 3' DNA sequence of XP22.
SEQ ID NO: 181 is the 5' DNA sequence of XP25.
SEQ ID NO: 182 is the 3' DNA sequence of XP25.
SEQ ID NO: 183 is the full-length DNA sequence of TbH4-XP1.
SEQ ID NO: 184 is the predicted amino acid sequence of TbH4-XP1.
SEQ ID NO: 185 is the predicted amino acid sequence encoded by the reverse
complement of TbH4-XP1.
SEQ ID NO: 186 is a first predicted amino acid sequence encoded by XP36.
SEQ ID NO: 187 is a second predicted amino acid sequence encoded by XP36.
SEQ ID NO: 188 is the predicted amino acid sequence encoded by the reverse
complement of XP36.




SEQ ID NO: 189 is the [...]
SEQ ID NO: 190 is the [...]
SEQ ID NO: 191 is the [...]
SEQ ID NO: 192 is the [...]
SEQ ID NO: 193 is the [...]
SEQ ID NO: 194 is the [...]
SEQ ID NO: 195 is the [...]
SEQ ID NO: 196 is the [...]
SEQ ID NO: 197 is the [...]
SEQ ID NO: 198 is the [...]
SEQ ID NO: 199 is the [...]
SEQ ID NO: 200 is the [...]
SEQ ID NO: 201 is the [...]
SEQ ID NO: 202 is the [...]
SEQ ID NO: 203 is the [...]
SEQ ID NO: 204 is the [...]
SEQ ID NO: 205-212 [...]
protein containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as
TbF-2).

SEQ ID NO: 213 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 214 is the amino acid sequence of the fusion protein TbF-2. 






PCT/US97/18293

WO 98/16646 


DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to
compositions and methods for preventing, treating and diagnosing tuberculosis. The
compositions of the subject invention include polypeptides that comprise at least one
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications. Polypeptides within the
scope of the present invention include, but are not limited to, immunogenic soluble
M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of
M. tuberculosis origin that is present in M. tuberculosis culture filtrate. As used herein,
the term "polypeptide" encompasses amino acid chains of any length, including full
length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent
peptide bonds. Thus, a polypeptide comprising an immunogenic portion of one of the
above antigens may consist entirely of the immunogenic portion, or may contain
additional sequences. The additional sequences may be derived from the native
M. tuberculosis antigen or may be heterologous, and such sequences may (but need not)
be immunogenic.

"Immunogenic," as used herein, refers to the ability to elicit an immune 
response (e.g., cellular) in a patient, such as a human, and/or in a biological sample. In 
particular, antigens that are immunogenic (and immunogenic portions or other variants
of such antigens) are capable of stimulating cell proliferation, interleukin-12 production
and/or interferon-γ production in biological samples comprising one or more cells
selected from the group of T cells, NK cells, B cells and macrophages, where the cells
are derived from an M. tuberculosis-immune individual. Polypeptides comprising at
least an immunogenic portion of one or more M. tuberculosis antigens may generally be
used to detect tuberculosis or to induce protective immunity against tuberculosis in a
patient.

The compositions and methods of this invention also encompass variants
of the above polypeptides. A "variant," as used herein, is a polypeptide that differs
from the native antigen only in conservative substitutions and/or modifications, such
that the ability of the polypeptide to induce an immune response is retained. Such
variants may generally be identified by modifying one of the above polypeptide
sequences, and evaluating the immunogenic properties of the modified polypeptide
using, for example, the representative procedures described herein.

A "conservative substitution" is one in which an amino acid is 
substituted for another amino acid that has similar properties, such that one skilled in 
the art of peptide chemistry would expect the secondary structure and hydropathic
nature of the polypeptide to be substantially unchanged. In general, the following
groups of amino acids represent conservative changes: (1) ala, pro, gly, glu, asp, gln,
asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and
(5) phe, tyr, trp, his.
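
The five groups above lend themselves to a simple membership check. The following Python sketch is illustrative only and is not part of the disclosed procedures: the function names are hypothetical, and the assumption is made here that two residues count as a conservative pair when they share at least one of the listed groups. It covers substitutions only, not the "modifications" also permitted in a variant, and it says nothing about whether immunogenicity is actually retained, which must still be evaluated experimentally.

```python
# Conservative-substitution groups exactly as listed in the text
# (three-letter residue codes, lower case).
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the residues are identical or share at least one group.

    Assumption (not stated in the text): sharing any one group is sufficient.
    """
    a, b = original.lower(), replacement.lower()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def differs_only_conservatively(native, candidate) -> bool:
    """Compare two equal-length residue lists position by position."""
    if len(native) != len(candidate):
        return False  # insertions and deletions are outside this sketch
    return all(is_conservative(x, y) for x, y in zip(native, candidate))


if __name__ == "__main__":
    native = ["Asp", "Pro", "Val"]  # first three residues of sequence (a)
    print(differs_only_conservatively(native, ["Glu", "Pro", "Ile"]))
    print(differs_only_conservatively(native, ["Lys", "Pro", "Val"]))
```

Under these assumptions the first comparison is accepted (Asp/Glu and Val/Ile each share a group) and the second is rejected (Asp and Lys share none).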

Variants may also (or alternatively) be modified by, for example, the
deletion or addition of amino acids that have minimal influence on the immunogenic
properties, secondary structure and hydropathic nature of the polypeptide. For example,
a polypeptide may be conjugated to a signal (or leader) sequence at the N-terminal end
of the protein which co-translationally or post-translationally directs transfer of the
protein. The polypeptide may also be conjugated to a linker or other sequence for ease
of synthesis, purification or identification of the polypeptide (e.g., poly-His), or to
enhance binding of the polypeptide to a solid support. For example, a polypeptide may
be conjugated to an immunoglobulin Fc region.

In a related aspect, combination polypeptides are disclosed. A
"combination polypeptide" is a polypeptide comprising at least one of the above
immunogenic portions and one or more additional immunogenic M. tuberculosis
sequences, which are joined via a peptide linkage into a single amino acid chain. The
sequences may be joined directly (i.e., with no intervening amino acids) or may be
joined by way of a linker sequence (e.g., Gly-Cys-Gly) that does not significantly
diminish the immunogenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to
those of ordinary skill in the art, including anion-exchange and reverse phase
chromatography. Purified antigens are then evaluated for their ability to elicit an
appropriate immune response (e.g., cellular) using, for example, the representative
methods described herein. Immunogenic antigens may then be partially sequenced
using techniques such as traditional Edman chemistry. See Edman and Berg, Eur. J.
Biochem. 50:116-132, 1967.

Immunogenic antigens may also be produced recombinantly using a
DNA sequence that encodes the antigen, which has been inserted into an expression
vector and expressed in an appropriate host. DNA molecules encoding soluble antigens
may be isolated by screening an appropriate M. tuberculosis expression library with
anti-sera (e.g., rabbit) raised specifically against soluble M. tuberculosis antigens. DNA
sequences encoding antigens that may or may not be soluble may be identified by
screening an appropriate M. tuberculosis genomic or cDNA expression library with sera
obtained from patients infected with M. tuberculosis. Such screens may generally be
performed using techniques well known to those of ordinary skill in the art, such as
those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by
screening an appropriate M. tuberculosis cDNA or genomic DNA library for DNA
sequences that hybridize to degenerate oligonucleotides derived from partial amino acid
sequences of isolated soluble antigens. Degenerate oligonucleotide sequences for use in
such a screen may be designed and synthesized, and the screen may be performed, as
described (for example) in Sambrook et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989 (and references cited
therein). Polymerase chain reaction (PCR) may also be employed, using the above
oligonucleotides in methods well known in the art, to isolate a nucleic acid probe from a
cDNA or genomic library. The library screen may then be performed using the isolated
probe.
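
As an illustration of how a degenerate oligonucleotide can be derived from a partial amino acid sequence, the following Python sketch back-translates a peptide using the standard genetic code and IUPAC ambiguity codes. It is a minimal sketch under stated assumptions, not the design procedure of Sambrook et al.: the function names are hypothetical, the peptide is given in one-letter code, no codon-usage bias is applied, and each residue is collapsed to a single ambiguous codon, which for six-codon residues (Leu, Ser, Arg) also matches a few codons of other residues.

```python
from itertools import product
from math import prod

# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter residue to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(codon)

# IUPAC nucleotide ambiguity codes, keyed by the alphabetically sorted base set.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_codon(aa: str) -> str:
    """Collapse all codons for one residue into a single IUPAC codon."""
    codons = CODONS_FOR[aa]
    return "".join(
        IUPAC["".join(sorted({codon[i] for codon in codons}))] for i in range(3)
    )


def degenerate_oligo(peptide: str) -> str:
    """Back-translate a one-letter peptide into a degenerate oligonucleotide."""
    return "".join(degenerate_codon(aa) for aa in peptide.upper())


def codon_combinations(peptide: str) -> int:
    """Number of distinct coding sequences that encode the peptide exactly."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


if __name__ == "__main__":
    peptide = "DPVDAV"  # Asp-Pro-Val-Asp-Ala-Val, the start of sequence (a)
    print(degenerate_oligo(peptide), codon_combinations(peptide))
```

For the first six residues of sequence (a), this yields an 18-base oligonucleotide with an ambiguous base at every third position. In practice, stretches rich in low-degeneracy residues (Met, Trp, Asp, Lys and the like) are usually preferred, since they keep the number of oligonucleotide species in the pool small.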

Alternatively, genomic or cDNA libraries derived from M. tuberculosis
may be screened directly using peripheral blood mononuclear cells (PBMCs) or T cell
lines or clones derived from one or more M. tuberculosis-immune individuals. In
general, PBMCs and/or T cells for use in such screens may be prepared as described
below. Direct library screens may generally be performed by assaying pools of
expressed recombinant proteins for the ability to induce proliferation and/or interferon-γ
production in T cells derived from an M. tuberculosis-immune individual.
Alternatively, potential T cell antigens may be first selected based on antibody
reactivity, as described above.

Regardless of the method of preparation, the antigens (and immunogenic
portions thereof) described herein (which may or may not be soluble) have the ability to
induce an immunogenic response. More specifically, the antigens have the ability to
induce proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) in T cells, NK cells, B cells and/or macrophages derived from an
M. tuberculosis-immune individual. The selection of cell type for use in evaluating an
immunogenic response to an antigen will, of course, depend on the desired response. For
example, interleukin-12 production is most readily evaluated using preparations
containing B cells and/or macrophages. An M. tuberculosis-immune individual is one
who is considered to be resistant to the development of tuberculosis by virtue of having
mounted an effective T cell response to M. tuberculosis (i.e., substantially free of
disease symptoms). Such individuals may be identified based on a strongly positive
(i.e., greater than about 10 mm diameter induration) intradermal skin test response to
tuberculosis proteins (PPD) and an absence of any signs or symptoms of tuberculosis
disease. T cells, NK cells, B cells and macrophages derived from M. tuberculosis-immune
individuals may be prepared using methods known to those of ordinary skill in
the art. For example, a preparation of PBMCs (i.e., peripheral blood mononuclear cells)
may be employed without further separation of component cells. PBMCs may
generally be prepared, for example, using density centrifugation through Ficoll™
(Winthrop Laboratories, NY). T cells for use in the assays described herein may also be
purified directly from PBMCs. Alternatively, an enriched T cell line reactive against
mycobacterial proteins, or T cell clones reactive to individual mycobacterial proteins,
may be employed. Such T cell clones may be generated by, for example, culturing
PBMCs from M. tuberculosis-immune individuals with mycobacterial proteins for a
period of 2-4 weeks. This allows expansion of only the mycobacterial protein-specific
T cells, resulting in a line composed solely of such cells. These cells may then be
cloned and tested with individual proteins, using methods known to those of ordinary
skill in the art, to more accurately define individual T cell specificity. In general,
antigens that test positive in assays for proliferation and/or cytokine production (i.e.,
interferon-γ and/or interleukin-12 production) performed using T cells, NK cells, B cells
and/or macrophages derived from an M. tuberculosis-immune individual are considered
immunogenic. Such assays may be performed, for example, using the representative
procedures described below. Immunogenic portions of such antigens may be identified
using similar assays, and may be present within the polypeptides described herein.
The ability of a polypeptide (e.g., an immunogenic antigen, or a portion
or other variant thereof) to induce cell proliferation is evaluated by contacting the cells
(e.g., T cells and/or NK cells) with the polypeptide and measuring the proliferation of
the cells. In general, the amount of polypeptide that is sufficient for evaluation of about
10^5 cells ranges from about 10 ng/mL to about 100 μg/mL and preferably is about
10 ng/mL. The incubation of polypeptide with cells is typically performed at 37°C for
about six days. Following incubation with polypeptide, the cells are assayed for a
proliferative response, which may be evaluated by methods known to those of ordinary
skill in the art, such as exposing cells to a pulse of radiolabeled thymidine and
measuring the incorporation of label into cellular DNA. In general, a polypeptide that
results in at least a three fold increase in proliferation above background (i.e., the
proliferation observed for cells cultured without polypeptide) is considered to be able to
induce proliferation.
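
The three fold criterion above can be illustrated with a minimal sketch. The function names and the counts-per-minute values are hypothetical and are not taken from this specification; the sketch assumes that replicate wells are averaged before the ratio is taken.

```python
from statistics import mean


def stimulation_index(stimulated_cpm, background_cpm):
    """Ratio of mean label incorporation with polypeptide to mean background."""
    return mean(stimulated_cpm) / mean(background_cpm)


def induces_proliferation(stimulated_cpm, background_cpm, min_fold=3.0):
    """Apply the 'at least three fold above background' criterion."""
    return stimulation_index(stimulated_cpm, background_cpm) >= min_fold


# Hypothetical thymidine incorporation (counts per minute) for replicate wells.
stimulated = [15200, 14800]   # cells cultured with polypeptide
background = [4100, 3900]     # cells cultured without polypeptide

si = stimulation_index(stimulated, background)
positive = induces_proliferation(stimulated, background)
```

Averaging replicates is only one possible convention; a stricter screen could require every replicate to exceed the threshold individually.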

The ability of a polypeptide to stimulate the production of interferon-γ
and/or interleukin-12 in cells may be evaluated by contacting the cells with the
polypeptide and measuring the level of interferon-γ or interleukin-12 produced by the
cells. In general, the amount of polypeptide that is sufficient for the evaluation of about
10^5 cells ranges from about 10 ng/mL to about 100 μg/mL and preferably is about
10 μg/mL. The polypeptide may, but need not, be immobilized on a solid support, such
as a bead or a biodegradable microsphere, such as those described in U.S. Patent
Nos. 4,897,268 and 5,075,109. The incubation of polypeptide with the cells is typically
performed at 37°C for about six days. Following incubation with polypeptide, the cells
are assayed for interferon-γ and/or interleukin-12 (or one or more subunits thereof),
which may be evaluated by methods known to those of ordinary skill in the art, such as
an enzyme-linked immunosorbent assay (ELISA) or, in the case of IL-12 P70 subunit, a
bioassay such as an assay measuring proliferation of T cells. In general, a polypeptide
that results in the production of at least 50 pg of interferon-γ per mL of cultured
supernatant (containing 10^4-10^5 T cells per mL) is considered able to stimulate the
production of interferon-γ. A polypeptide that stimulates the production of at least
10 pg/mL of IL-12 P70 subunit, and/or at least 100 pg/mL of IL-12 P40 subunit, per 10^5
macrophages or B cells (or per 3 x 10^5 PBMC) is considered able to stimulate the
production of IL-12.
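
The cytokine thresholds stated above can be written as a short sketch. The constant and function names are hypothetical, and the sketch assumes that the measured concentrations have already been normalized to the cell numbers given in the text.

```python
# Thresholds from the text, in pg per mL of culture supernatant.
IFN_GAMMA_MIN_PG_ML = 50.0
IL12_P70_MIN_PG_ML = 10.0
IL12_P40_MIN_PG_ML = 100.0


def stimulates_ifn_gamma(ifn_gamma_pg_ml):
    """True when interferon-gamma production meets the 50 pg/mL threshold."""
    return ifn_gamma_pg_ml >= IFN_GAMMA_MIN_PG_ML


def stimulates_il12(p70_pg_ml=None, p40_pg_ml=None):
    """True when either subunit threshold is met (the text says 'and/or')."""
    p70_ok = p70_pg_ml is not None and p70_pg_ml >= IL12_P70_MIN_PG_ML
    p40_ok = p40_pg_ml is not None and p40_pg_ml >= IL12_P40_MIN_PG_ML
    return p70_ok or p40_ok


# Hypothetical measurements for one polypeptide.
ifn_positive = stimulates_ifn_gamma(72.0)
il12_positive = stimulates_il12(p70_pg_ml=4.0, p40_pg_ml=130.0)
```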

In general, immunogenic antigens are those antigens that stimulate
proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) in T cells, NK cells, B cells and/or macrophages derived from at least about
25% of M. tuberculosis-immune individuals. Among these immunogenic antigens,
polypeptides having superior therapeutic properties may be distinguished based on the
magnitude of the responses in the above assays and based on the percentage of
individuals for which a response is observed. In addition, antigens having superior
therapeutic properties will not stimulate proliferation and/or cytokine production in
vitro in cells derived from more than about 25% of individuals that are not
M. tuberculosis-immune, thereby eliminating responses that are not specifically due to
M. tuberculosis-responsive cells. Those antigens that induce a response in a high
percentage of T cell, NK cell, B cell and/or macrophage preparations from
M. tuberculosis-immune individuals (with a low incidence of responses in cell
preparations from other individuals) have superior therapeutic properties.
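
As an illustration of the two percentage criteria above, the following sketch computes responder fractions over panels of donors. The donor panels are hypothetical, and treating "about 25%" as an exact cutoff of 0.25 is an assumption made only for the example.

```python
def fraction_responding(responses):
    """Fraction of donors whose cells responded (responses is a list of bools)."""
    return sum(responses) / len(responses)


def is_immunogenic(immune_responses, min_fraction=0.25):
    """Response in at least about 25% of M. tuberculosis-immune donors."""
    return fraction_responding(immune_responses) >= min_fraction


def is_specific(non_immune_responses, max_fraction=0.25):
    """Response in no more than about 25% of non-immune donors."""
    return fraction_responding(non_immune_responses) <= max_fraction


# Hypothetical panels: True means a positive proliferation or cytokine result.
immune_panel = [True] * 6 + [False] * 4       # 60% responders
non_immune_panel = [True] * 1 + [False] * 9   # 10% responders

superior_candidate = is_immunogenic(immune_panel) and is_specific(non_immune_panel)
```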

Antigens with superior therapeutic properties may also be identified
based on their ability to diminish the severity of M. tuberculosis infection in
experimental animals, when administered as a vaccine. Suitable vaccine preparations
for use on experimental animals are described in detail below. Efficacy may be
determined based on the ability of the antigen to provide at least about a 50% reduction
in bacterial numbers and/or at least about a 40% decrease in mortality following
experimental infection. Suitable experimental animals include mice, guinea pigs and
primates.

Antigens having superior diagnostic properties may generally be
identified based on the ability to elicit a response in an intradermal skin test performed
on an individual with active tuberculosis, but not in a test performed on an individual
who is not infected with M. tuberculosis. Skin tests may generally be performed as
described below, with a response of at least 5 mm induration considered positive.

Immunogenic portions of the antigens described herein may be prepared
and identified using well known techniques, such as those summarized in Paul,
Fundamental Immunology, 3d ed., Raven Press, 1993, pp. 243-247 and references cited
therein. Such techniques include screening polypeptide portions of the native antigen
for immunogenic properties. The representative proliferation and cytokine production
assays described herein may generally be employed in these screens. An immunogenic
portion of a polypeptide is a portion that, within such representative assays, generates
an immune response (e.g., proliferation, interferon-γ production and/or interleukin-12
production) that is substantially similar to that generated by the full length antigen. In
other words, an immunogenic portion of an antigen may generate at least about 20%,
and preferably about 100%, of the proliferation induced by the full length antigen in the
model proliferation assay described herein. An immunogenic portion may also, or
alternatively, stimulate the production of at least about 20%, and preferably about
100%, of the interferon-γ and/or interleukin-12 induced by the full length antigen in the
model assay described herein.
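
The 20% criterion for an immunogenic portion can be sketched as a simple ratio against the full length antigen. The names and readings below are hypothetical; the sketch assumes both values come from the same assay and are expressed in the same units, and it does not subtract background.

```python
def relative_response(portion_response, full_length_response):
    """Response to the portion as a fraction of the full length response."""
    return portion_response / full_length_response


def is_immunogenic_portion(portion_response, full_length_response,
                           min_fraction=0.20):
    """Apply the 'at least about 20% of the full length response' criterion."""
    return relative_response(portion_response, full_length_response) >= min_fraction


# Hypothetical proliferation readings (counts per minute).
full_length = 15000
candidate_a = 4500   # 30% of the full length response
candidate_b = 2000   # about 13% of the full length response

a_ok = is_immunogenic_portion(candidate_a, full_length)
b_ok = is_immunogenic_portion(candidate_b, full_length)
```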

Portions and other variants of M. tuberculosis antigens may be generated
by synthetic or recombinant means. Synthetic polypeptides having fewer than about
100 amino acids, and generally fewer than about 50 amino acids, may be generated
using techniques well known to those of ordinary skill in the art. For example, such
polypeptides may be synthesized using any of the commercially available solid-phase
techniques, such as the Merrifield solid-phase synthesis method, where amino acids are
sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc.
85:2149-2154, 1963. Equipment for automated synthesis of polypeptides is
commercially available from suppliers such as Applied BioSystems, Inc., Foster City,
CA, and may be operated according to the manufacturer's instructions. Variants of a
native antigen may generally be prepared using standard mutagenesis techniques, such
as oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence
may also be removed using standard techniques to permit preparation of truncated
polypeptides.

Recombinant polypeptides containing portions and/or variants of a
native antigen may be readily prepared from a DNA sequence encoding the polypeptide
using a variety of techniques well known to those of ordinary skill in the art. For
example, supernatants from suitable host/vector systems which secrete recombinant
protein into culture media may be first concentrated using a commercially available
filter. Following concentration, the concentrate may be applied to a suitable
purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or
more reverse phase HPLC steps can be employed to further purify a recombinant
protein.

Any of a variety of expression vectors known to those of ordinary skill in
the art may be employed to express recombinant polypeptides of this invention.
Expression may be achieved in any appropriate host cell that has been transformed or
transfected with an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher
eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian
cell line such as COS or CHO. The DNA sequences expressed in this manner may
encode naturally occurring antigens, portions of naturally occurring antigens, or other
variants thereof.

In general, regardless of the method of preparation, the polypeptides
disclosed herein are prepared in substantially pure form. Preferably, the polypeptides
are at least about 80% pure, more preferably at least about 90% pure and most
preferably at least about 99% pure. In certain preferred embodiments, described in
detail below, the substantially pure polypeptides are incorporated into pharmaceutical
compositions or vaccines for use in one or more of the methods disclosed herein.

In certain specific embodiments, the subject invention discloses
polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis
antigen having one of the following N-terminal sequences, or a variant thereof that
differs only in conservative substitutions and/or modifications:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
    Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
    Ser; (SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
    Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
    Pro; (SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
    (SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro;
    (SEQ ID No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ser-
    Pro-Pro-Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
    Gly; (SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
    Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
    Ala-Asn; (SEQ ID No. 128)

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
    Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
    Asp; (SEQ ID No. 135) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
    Gly; (SEQ ID No. 136)

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen identified as (g) above is provided in SEQ ID No. 52, and the
polypeptide encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. A DNA
sequence encoding the antigen defined as (a) above is provided in SEQ ID No. 101; its
deduced amino acid sequence is provided in SEQ ID No. 102. A DNA sequence
corresponding to antigen (d) above is provided in SEQ ID No. 24, a DNA sequence
corresponding to antigen (c) is provided in SEQ ID No. 25 and a DNA sequence
corresponding to antigen (i) is provided in SEQ ID No. 99; its deduced amino acid
sequence is provided in SEQ ID No. 100.

In a further specific embodiment, the subject invention discloses
polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen
having one of the following N-terminal sequences, or a variant thereof that differs only
in conservative substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
    Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
    Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)

wherein Xaa may be any amino acid, preferably a cysteine residue.
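
For readers who handle such sequences computationally, the following sketch converts a three-letter N-terminal sequence to one-letter code and treats Xaa as a wildcard when comparing it with the start of a protein. The helper names and the test protein strings are hypothetical; only the example sequence (Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val) is taken from the lists above.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",  # unidentified residue
}


def to_one_letter(three_letter_seq):
    """Convert 'Asp-Ile-Gly-...' to one-letter code, ignoring stray spaces."""
    residues = [r.strip() for r in three_letter_seq.split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r] for r in residues)


def matches_n_terminus(one_letter_seq, protein):
    """True if the protein starts with the sequence; X matches any residue."""
    pattern = "".join("[A-Z]" if c == "X" else c for c in one_letter_seq)
    return re.match(pattern, protein) is not None


seq = "Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val"
one_letter = to_one_letter(seq)

# Hypothetical protein strings used only to exercise the matcher.
hit = matches_n_terminus(one_letter, "DIGSESTEDQQCAVKLM")
miss = matches_n_terminus(one_letter, "AIGSESTEDQQCAVKLM")
```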

In other specific embodiments, the subject invention discloses 
polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis 
antigen (or a variant of such an antigen) that comprises one or more of the amino acid 
sequences encoded by (a) the DNA sequences of SEQ ID Nos.: 1, 2, 4-10, 13-25 and 
52; (b) the complements of such DNA sequences, or (c) DNA sequences substantially
homologous to a sequence in (a) or (b). 

In further specific embodiments, the subject invention discloses 
polypeptides comprising at least an immunogenic portion of a M. tuberculosis antigen 
(or a variant of such an antigen), which may or may not be soluble, that comprises one
or more of the amino acid sequences encoded by (a) the DNA sequences of SEQ ID
Nos.: 26-51, 138, 139, 163-183 and 201, (b) the complements of such DNA sequences
or (c) DNA sequences substantially homologous to a sequence in (a) or (b). 

In the specific embodiments discussed above, the M. tuberculosis
antigens include variants that are encoded by DNA sequences which are substantially
homologous to one or more of the DNA sequences specifically recited herein. "Substantial
homology," as used herein, refers to DNA sequences that are capable of hybridizing
under moderately stringent conditions. Suitable moderately stringent conditions include
prewashing in a solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing
at 50°C-65°C, 5X SSC, overnight or, in the case of cross-species homology, at 45°C,
0.5X SSC; followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X
and 0.2X SSC containing 0.1% SDS. Such hybridizing DNA sequences are also
within the scope of this invention, as are nucleotide sequences that, due to code
degeneracy, encode an immunogenic polypeptide that is encoded by a hybridizing DNA
sequence.

In a related aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, a polypeptide of
the present invention and a known M. tuberculosis antigen, such as the 38 kD antigen
described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989, (Genbank
Accession No. M30046) or ESAT-6 (SEQ ID Nos. 103 and 104), together with variants
of such fusion proteins. The fusion proteins of the present invention may also include a
linker peptide between the first and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is
constructed using known recombinant DNA techniques to assemble separate DNA
sequences encoding the first and second polypeptides into an appropriate expression
vector. The 3' end of a DNA sequence encoding the first polypeptide is ligated, with or
without a peptide linker, to the 5' end of a DNA sequence encoding the second
polypeptide so that the reading frames of the sequences are in phase to permit mRNA
translation of the two DNA sequences into a single fusion protein that retains the
biological activity of both the first and the second polypeptides.
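
The in-phase joining described above can be illustrated in silico. The sketch below is an assumption-laden toy, not the cloning procedure itself: the function names and the short coding sequences are hypothetical, each segment is assumed to be supplied as whole codons, and the stop codon of the first sequence is removed so that translation continues into the second.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_terminal_stop(orf):
    """Remove a trailing stop codon so translation can read through."""
    return orf[:-3] if orf[-3:] in STOP_CODONS else orf


def build_fusion_orf(first, second, linker=""):
    """Join two coding sequences 5' to 3', optionally via a linker, in frame."""
    parts = [s.upper() for s in (first, linker, second)]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("each segment must be a whole number of codons")
    fused = strip_terminal_stop(parts[0]) + parts[1] + parts[2]
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # Only the final codon may be a stop; an earlier one would truncate the fusion.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon would truncate the fusion protein")
    return fused


# Hypothetical toy sequences: Met-Ala-Asp-stop, Gly-Ser linker, Ala-Pro-Lys-stop.
first_orf = "ATGGCTGATTAA"
linker_orf = "GGTTCT"
second_orf = "GCTCCGAAATGA"

fusion = build_fusion_orf(first_orf, second_orf, linker_orf)
```

A real construct would also need the regulatory elements and restriction or assembly sites, which this sketch does not model.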
A peptide linker sequence may be employed to separate the first and the
second polypeptides by a distance sufficient to ensure that each polypeptide folds into
its secondary and tertiary structures. Such a peptide linker sequence is incorporated into
the fusion protein using standard techniques well known in the art. Suitable peptide
linker sequences may be chosen based on the following factors: (1) their ability to
adopt a flexible extended conformation; (2) their inability to adopt a secondary structure
that could interact with functional epitopes on the first and second polypeptides; and
(3) the lack of hydrophobic or charged residues that might react with the polypeptide
functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser
residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the
linker sequence. Amino acid sequences which may be usefully employed as linkers
include those disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc.
Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent
No. 4,751,180. The linker sequence may be from 1 to about 50 amino acids in length.
Peptide linker sequences are not required when the first and second polypeptides have
non-essential N-terminal amino acid regions that can be used to separate the functional
domains and prevent steric interference.

The ligated DNA sequences are operably linked to suitable
transcriptional or translational regulatory elements. The regulatory elements
responsible for expression of DNA are located only 5' to the DNA sequence encoding
the first polypeptide. Similarly, stop codons required to end translation and
transcription termination signals are only present 3' to the DNA sequence encoding the
second polypeptide.

In another aspect, the present invention provides methods for using one
or more of the above polypeptides or fusion proteins (or DNA molecules encoding such
polypeptides) to induce protective immunity against tuberculosis in a patient. As used
herein, a "patient" refers to any warm-blooded animal, preferably a human. A patient
may be afflicted with a disease, or may be free of detectable disease and/or infection. In
other words, protective immunity may be induced to prevent or treat tuberculosis.
In this aspect, the polypeptide, fusion protein or DNA molecule is
generally present within a pharmaceutical composition and/or a vaccine.
Pharmaceutical compositions may comprise one or more polypeptides, each of which
may contain one or more of the above sequences (or variants thereof), and a
physiologically acceptable carrier. Vaccines may comprise one or more of the above
polypeptides and a non-specific immune response enhancer, such as an adjuvant or a
liposome (into which the polypeptide is incorporated). Such pharmaceutical
compositions and vaccines may also contain other M. tuberculosis antigens, either
incorporated into a combination polypeptide or present within a separate polypeptide.

Alternatively, a vaccine may contain DNA encoding one or more
polypeptides as described above, such that the polypeptide is generated in situ. In such
vaccines, the DNA may be present within any of a variety of delivery systems known to
those of ordinary skill in the art, including nucleic acid expression systems, bacterial
and viral expression systems. Appropriate nucleic acid expression systems contain the
necessary DNA sequences for expression in the patient (such as a suitable promoter and
terminating signal). Bacterial delivery systems involve the administration of a
bacterium (such as Bacillus Calmette-Guerin) that expresses an immunogenic portion
of the polypeptide on its cell surface. In a preferred embodiment, the DNA may be
introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus,
or adenovirus), which may involve the use of a non-pathogenic (defective), replication
competent virus. Techniques for incorporating DNA into such expression systems are
well known to those of ordinary skill in the art. The DNA may also be "naked," as
described, for example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by
Cohen, Science 259:1691-1692, 1993. The uptake of naked DNA may be increased by
coating the DNA onto biodegradable beads, which are efficiently transported into the
cells.

In a related aspect, a DNA vaccine as described above may be
administered simultaneously with or sequentially to either a polypeptide of the present
invention or a known M. tuberculosis antigen, such as the 38 kD antigen described
above. For example, administration of DNA encoding a polypeptide of the present
invention, either "naked" or in a delivery system as described above, may be followed
by administration of an antigen in order to enhance the protective immune effect of the
vaccine.

Routes and frequency of administration, as well as dosage, will vary
from individual to individual and may parallel those currently being used in
immunization using BCG. In general, the pharmaceutical compositions and vaccines
may be administered by injection (e.g., intracutaneous, intramuscular, intravenous or
subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 3 doses may
be administered for a 1-36 week period. Preferably, 3 doses are administered, at
intervals of 3-4 months, and booster vaccinations may be given periodically thereafter.
Alternate protocols may be appropriate for individual patients. A suitable dose is an
amount of polypeptide or DNA that, when administered as described above, is capable
of raising an immune response in an immunized patient sufficient to protect the patient
from M. tuberculosis infection for at least 1-2 years. In general, the amount of
polypeptide present in a dose (or produced in situ by the DNA in a dose) ranges from
about 1 pg to about 100 mg per kg of host, typically from about 10 pg to about 1 mg,
and preferably from about 100 pg to about 1 μg. Suitable dose sizes will vary with the
size of the patient, but will typically range from about 0.1 mL to about 5 mL.

While any suitable carrier known to those of ordinary skill in the art may
be employed in the pharmaceutical compositions of this invention, the type of carrier
will vary depending on the mode of administration. For parenteral administration, such
as subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a
wax or a buffer. For oral administration, any of the above carriers or a solid carrier,
such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum,
cellulose, glucose, sucrose, and magnesium carbonate, may be employed.
Biodegradable microspheres (e.g., polylactic galactide) may also be employed as
carriers for the pharmaceutical compositions of this invention. Suitable biodegradable
microspheres are disclosed, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109.

Any of a variety of adjuvants may be employed in the vaccines of this
invention to nonspecifically enhance the immune response. Most adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum
hydroxide or mineral oil, and a nonspecific stimulator of immune responses, such as
lipid A, Bordetella pertussis or Mycobacterium tuberculosis. Suitable adjuvants are
commercially available as, for example, Freund's Incomplete Adjuvant and Freund's
Complete Adjuvant (Difco Laboratories) and Merck Adjuvant 65 (Merck and
Company, Inc., Rahway, NJ). Other suitable adjuvants include alum, biodegradable
microspheres, monophosphoryl lipid A and quil A.

In another aspect, this invention provides methods for using one or more
of the polypeptides described above to diagnose tuberculosis using a skin test. As used
herein, a "skin test" is any assay performed directly on a patient in which a delayed-type
hypersensitivity (DTH) reaction (such as swelling, reddening or dermatitis) is measured
following intradermal injection of one or more polypeptides as described above. Such
injection may be achieved using any suitable device sufficient to contact the
polypeptide or polypeptides with dermal cells of the patient, such as a tuberculin
syringe or 1 mL syringe. Preferably, the reaction is measured at least 48 hours after
injection, more preferably 48-72 hours.

The DTH reaction is a cell-mediated immune response, which is greater 
in patients that have been exposed previously to the test antigen (i.e., the immunogenic 
portion of the polypeptide employed, or a variant thereof). The response may be 
measured visually, using a ruler. In general, a response that is greater than about 0.5 cm
in diameter, preferably greater than about 1.0 cm in diameter, is a positive response, 
indicative of tuberculosis infection, which may or may not be manifested as an active 
disease. 
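
The reading rule above can be expressed as a small sketch. The function name and the labels it returns are hypothetical, and treating "about 0.5 cm" and "about 1.0 cm" as exact cutoffs is an assumption made only for illustration.

```python
def classify_skin_test(diameter_cm, positive_cm=0.5, preferred_cm=1.0):
    """Classify a DTH reaction by its measured diameter in centimeters."""
    if diameter_cm > preferred_cm:
        return "positive (exceeds preferred 1.0 cm threshold)"
    if diameter_cm > positive_cm:
        return "positive"
    return "negative"


# Hypothetical ruler readings taken 48-72 hours after injection.
readings_cm = [0.3, 0.8, 1.4]
results = [classify_skin_test(d) for d in readings_cm]
```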

The polypeptides of this invention are preferably formulated, for use in a
skin test, as pharmaceutical compositions containing a polypeptide and a
physiologically acceptable carrier, as described above. Such compositions typically
contain one or more of the above polypeptides in an amount ranging from about 1 μg to
about 100 μg, preferably from about 10 μg to about 50 μg in a volume of 0.1 mL.
Preferably, the carrier employed in such pharmaceutical compositions is a saline
solution with appropriate preservatives, such as phenol and/or Tween 80™.
In a preferred embodiment, a polypeptide employed in a skin test is of
sufficient size such that it remains at the site of injection for the duration of the reaction
period. In general, a polypeptide that is at least 9 amino acids in length is sufficient.
The polypeptide is also preferably broken down by macrophages within hours of
injection to allow presentation to T-cells. Such polypeptides may contain repeats of one
or more of the above sequences and/or other immunogenic or nonimmunogenic
sequences.

The following Examples are offered by way of illustration and not by
way of limitation.



EXAMPLES 



EXAMPLE 1 

Purification and Characterization of Polypeptides
from M. tuberculosis Culture Filtrate

This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the 
following example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 μ filter into a
sterile 2.5 L bottle. The media was next filtered through a 0.2 μ filter into a sterile 4 L
bottle and NaN3 was added to the culture filtrate to a concentration of 0.04%. The
bottles were then placed in a 4°C cold room.

The culture filtrate was concentrated by placing the filtrate in a 12 L
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane.
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the
12 L volume to approximately 50 ml.

The culture filtrate was dialyzed into 0.1% ammonium bicarbonate using
an 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium
bicarbonate solution. Protein concentration was then determined by a commercially
available BCA assay (Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides
resuspended in distilled water. The polypeptides were dialyzed against 0.01 mM 1,3
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the
initial conditions for anion exchange chromatography. Fractionation was performed
using gel perfusion chromatography on a POROS 146 II Q/M anion exchange column
4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl
gradient in the above buffer system. The column eluent was monitored at a wavelength
of 220 nm.

The pools of polypeptides eluting from the ion exchange column were
dialyzed against distilled water and lyophilized. The resulting material was dissolved in
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions
containing the eluted polypeptides were collected to maximize the purity of the
individual samples. Approximately 200 purified polypeptides were obtained.

The purified polypeptides were then screened for the ability to induce T-cell
proliferation in PBMC preparations. The PBMCs from donors known to be PPD
skin test positive and whose T-cells were shown to proliferate in response to PPD and
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640
supplemented with 10% pooled human serum and 50 μg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 μg/mL. After six
days of culture in 96-well round-bottom plates in a volume of 200 μl, 50 μl of medium
was removed from each well for determination of IFN-γ levels, as described below.
The plates were then pulsed with 1 μCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter.
Fractions that resulted in proliferation in both replicates three fold greater than the
proliferation observed in cells cultured in medium alone were considered positive.

IFN-γ was measured using an enzyme-linked immunosorbent assay
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to
human IFN-γ (PharMingen, San Diego, CA) in PBS for four hours at room temperature.
Wells were then blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour
at room temperature. The plates were then washed six times in PBS/0.2% TWEEN-20
and samples diluted 1:2 in culture medium in the ELISA plates were incubated
overnight at room temperature. The plates were again washed and a polyclonal rabbit
anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to
each well. The plates were then incubated for two hours at room temperature, washed
and horseradish peroxidase-coupled anti-rabbit IgG (Sigma Chemical Co., St. Louis,
MO) was added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two
hour incubation at room temperature, the plates were washed and TMB substrate added.
The reaction was stopped after 20 min with 1 N sulfuric acid. Optical density was
determined at 450 nm using 570 nm as a reference wavelength. Fractions that resulted
in both replicates giving an OD two fold greater than the mean OD from cells cultured
in medium alone, plus 3 standard deviations, were considered positive.
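
The positivity rule for the ELISA can be read in more than one way. The sketch below assumes one reading, namely a cutoff of twice the mean OD of the medium-alone wells plus three sample standard deviations of those wells, and requires both replicates to exceed it. The function names and OD values are hypothetical.

```python
from statistics import mean, stdev


def od_cutoff(control_ods):
    """Assumed cutoff: 2 x mean control OD + 3 x sample standard deviation."""
    return 2 * mean(control_ods) + 3 * stdev(control_ods)


def fraction_is_positive(replicate_ods, control_ods):
    """Positive only if every replicate OD exceeds the cutoff."""
    cutoff = od_cutoff(control_ods)
    return all(od > cutoff for od in replicate_ods)


# Hypothetical OD readings (450 nm, with 570 nm as reference).
medium_alone = [0.10, 0.12, 0.08]
fraction_a = [0.31, 0.29]   # both replicates above the cutoff
fraction_b = [0.31, 0.20]   # one replicate below the cutoff

a_positive = fraction_is_positive(fraction_a, medium_alone)
b_positive = fraction_is_positive(fraction_b, medium_alone)
```

If the intended rule were instead twice the quantity (mean plus 3 standard deviations), only `od_cutoff` would need to change.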

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced
from the amino terminus using traditional Edman chemistry. The amino acid
sequence was determined for each polypeptide by comparing the retention time of the
PTH amino acid derivative to the appropriate PTH derivative standards.
Using the procedure described above, antigens having the following
N-terminal sequences were isolated:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-
    Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 54)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
    Ser; (SEQ ID No. 55)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
    Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 56)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
    Pro; (SEQ ID No. 57)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
    (SEQ ID No. 58)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro;
    (SEQ ID No. 59)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-
    Pro-Pro-Ala; (SEQ ID No. 60) and

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
    Gly; (SEQ ID No. 61)

wherein Xaa may be any amino acid.

An additional antigen was isolated employing a microbore HPLC
purification step in addition to the procedure described above. Specifically, 20 μl of a
fraction comprising a mixture of antigens from the chromatographic purification step
previously described, was purified on an Aquapore C18 column (Perkin Elmer/Applied
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions
were eluted from the column with a linear gradient of 1%/minute of acetonitrile
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 μl/minute. The
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks
plus other smaller components and a polypeptide was obtained which was shown to
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following
N-terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
    Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-
    Ala-Asp (SEQ ID No. 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above.
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was
performed using anion exchange chromatography on a Poros QE column 4.6 x 100 mm
(Perseptive Biosystems) equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent
was monitored at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula,
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2
ml/min. Eluent was monitored at 214 nm.

The fraction with biological activity was separated into one major peak
plus other smaller components. Western blot of this peak onto PVDF membrane
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These
polypeptides were determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
    Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
    Asp; (SEQ ID No. 135) and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
    Gly; (SEQ ID No. 136), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the
results of such assays using PBMC preparations from a first and a second donor,
respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and
(g) above were obtained by screening a genomic M. tuberculosis library using 32P
end-labeled degenerate oligonucleotides corresponding to the N-terminal sequence and
containing M. tuberculosis codon bias. The screen performed using a probe
corresponding to antigen (a) above identified a clone having the sequence provided in
SEQ ID No. 101. The polypeptide encoded by SEQ ID No. 101 is provided in SEQ ID
No. 102. The screen performed using a probe corresponding to antigen (g) above
identified a clone having the sequence provided in SEQ ID No. 52. The polypeptide
encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. The screen performed
using a probe corresponding to antigen (d) above identified a clone having the sequence
provided in SEQ ID No. 24, and the screen performed with a probe corresponding to
antigen (c) identified a clone having the sequence provided in SEQ ID No. 25.
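
To give a sense of why degenerate probes are needed, the sketch below counts how many distinct DNA sequences could encode a stretch of an N-terminal sequence under the standard genetic code, and finds the least degenerate window. It is an illustration only: the window width and function names are hypothetical, and it does not model the M. tuberculosis codon bias that the text says was used to narrow the probe design.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def degeneracy(residues):
    """Number of distinct coding sequences for a list of three-letter residues."""
    return prod(CODON_COUNT[residue] for residue in residues)


def least_degenerate_window(residues, width):
    """Return (start index, degeneracy) of the least degenerate window.
    Ties resolve to the earliest window."""
    starts = range(len(residues) - width + 1)
    best = min(starts, key=lambda i: degeneracy(residues[i:i + width]))
    return best, degeneracy(residues[best:best + width])


# N-terminal sequence of antigen (d) as given in the text.
antigen_d = "Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro".split("-")

start, count = least_degenerate_window(antigen_d, width=6)
print(start, count)
```

Residues with few codons (Trp, Met) lower the count, which is why windows containing them are attractive probe targets.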

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched
contains some 173,000 proteins and is a combination of the Swiss, PIR databases along
with translated protein sequences (Version 87). No significant homologies to the amino
acid sequences for antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to
a sequence from M. leprae. The full length M. leprae sequence was amplified from
genomic DNA using the sequence obtained from GENBANK. This sequence was then
used to screen the M. tuberculosis library described below in Example 2 and a full
length copy of the M. tuberculosis homologue was obtained (SEQ ID No. 99).

The amino acid sequence for antigen (j) was found to be homologous to
a known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to
a sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in
Table 1:

TABLE 1

Results of PBMC Proliferation and IFN-γ Assays

Sequence    Proliferation    IFN-γ

(a)         +
(c)         +++              +++
(d)         ++               ++
(g)         +++              +++
(h)         +++              +++

In Table 1, responses that gave a stimulation index (SI) of between 2 and
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at
a concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored
as +++. The antigen of sequence (i) was found to have a high SI (+++) for one donor
and lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or
interferon-γ production.
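
The scoring scheme for Table 1 can be written out as a small function. This is a sketch under stated assumptions: the text does not say how the shared boundaries (SI of exactly 4 or 8) are assigned or how an SI below 2 is reported, so the choices below (4 and 8 fall in the ++ band, below 2 is "-") are illustrative, as is the function name.

```python
def score_si(si, concentration_ug=None):
    """Map a stimulation index (SI) to the +/++/+++ scale described for Table 1.

    Assumed boundary handling (not specified in the text):
      SI > 8                      -> '+++'
      4 <= SI <= 8                -> '++'
      2 <= SI < 4 at <= 1 ug      -> '++'
      2 <= SI < 4 otherwise       -> '+'
      SI < 2                      -> '-'
    """
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        if concentration_ug is not None and concentration_ug <= 1:
            return "++"
        return "+"
    return "-"


print(score_si(3))                        # moderate response at unspecified dose
print(score_si(3, concentration_ug=1))    # same SI reached at a low dose
print(score_si(9))                        # strong response
```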
EXAMPLE 2

Use of Patient Sera to Isolate M. tuberculosis Antigens

This example illustrates the isolation of antigens from M. tuberculosis
lysate by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a
2% NP40 solution, and alternately homogenized and sonicated three times. The
resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the
supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro
Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with
20 mM Tris pH 7.5 and bound proteins eluted with 1M NaCl. The 1M NaCl eluate was
dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with
DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with
α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to
pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column
(BioRad). Fractions were combined into nine pools, concentrated in a Centriprep 10
(Amicon, Beverly, MA) and then screened by Western blot for serological activity
using a serum pool from M. tuberculosis-infected patients which was not
immunoreactive with other antigens of the present invention.
The most reactive fraction was run in SDS-PAGE and transferred to
PVDF. A band at approximately 85 Kd was cut out yielding the sequence:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
    Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may
    be any amino acid.

Comparison of this sequence with those in the gene bank as described
above revealed no significant homologies to known sequences.

A DNA sequence that encodes the antigen designated as (m) above was
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID
NO: 137. A clone was identified having the DNA sequence provided in SEQ ID NO:
203. This sequence was found to encode the amino acid sequence provided in SEQ ID
NO: 204. Comparison of these sequences with those in the gene bank revealed some
similarity to sequences previously identified in M. tuberculosis and M. bovis.

EXAMPLE 3

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening an M. tuberculosis expression library with sera
obtained from patients infected with M. tuberculosis, or with anti-sera raised against
soluble M. tuberculosis antigens.

A. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised Against M. tuberculosis Supernatant

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The
DNA was randomly sheared and used to construct an expression library using the
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of
protein antigen in a total volume of 2 ml containing 10 µg muramyl dipeptide
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg
protein antigen. The anti-sera were used to screen the expression library as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing
immunoreactive antigens were purified. Phagemid from the plaques was rescued and
the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that
have not been previously identified in human M. tuberculosis. Recombinant antigens
were expressed and purified antigens used in the immunological analysis described in
Example 1. Proteins were induced by IPTG and purified by gel elution, as described in
Skeiky et al., J. Exp. Med. 181:1527-1537, 1995. Representative sequences of DNA
molecules identified in this screen are provided in SEQ ID Nos.: 1-25. The
corresponding predicted amino acid sequences are shown in SEQ ID Nos. 63-87.

On comparison of these sequences with known sequences in the gene
bank using the databases described above, it was found that the clones referred to
hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID Nos. 76, 68, 70, 75)
show some homology to sequences previously identified in Mycobacterium leprae but
not in M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID Nos.: 65,
73, 74, 53) have been previously identified in M. tuberculosis. No significant
homologies were found to TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13,
TbRA17, TbRA19, TbRA29, TbRA32, TbRA36 and the overlapping clones TbRA35
and TbRA12 (SEQ ID Nos. 63, 77, 81, 82, 64, 67, 69, 71, 75, 78, 80, 79, 66). The
clone TbRa24 overlaps with clone TbRa29.

The results of PBMC proliferation and interferon-γ assays performed on
representative recombinant antigens, and using T-cell preparations from several
different M. tuberculosis-immune patients, are presented in Tables 2 and 3,
respectively.
In Tables 2 and 3, responses that gave a stimulation index (SI) of
between 1.2 and 2 (compared to cells cultured in medium alone) were scored as ±, an SI
of 2-4 was scored as +, an SI of 4-8 or 2-4 at a concentration of 1 µg or less was scored
as ++ and an SI of greater than 8 was scored as +++. In addition, the effect of
concentration on proliferation and interferon-γ production is shown for two of the above
antigens in the attached Figure. For both proliferation and interferon-γ production,
TbRa3 was scored as ++ and TbRa9 as +.

These results indicate that these soluble antigens can induce proliferation
and/or interferon-γ production in T-cells derived from an M. tuberculosis-immune
individual.

B. Use of Sera From Patients having Pulmonary or Pleural Tuberculosis 
to Identify DNA Sequences Encoding M. tuberculosis Antigens 

The genomic DNA library described above, and an additional H37Rv
library, were screened using pools of sera obtained from patients with active
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic
DNA was isolated, subjected to partial Sau3A digestion and used to construct an
expression library using the Lambda ZAP expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with
active pulmonary or pleural disease, were used in the expression screening. The pools
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both
ELISA and immunoblot format. A fourth pool of sera from seven patients with active
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor,
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the
M. tuberculosis clones deduced.
Thirty two clones were purified. Of these, 31 represented sequences that
had not been previously identified in human M. tuberculosis. Representative sequences
of the DNA molecules identified are provided in SEQ ID Nos.: 26-51 and 105. Of
these, TbH-8-2 (SEQ. ID NO. 105) is a partial clone of TbH-8, and TbH-4 (SEQ. ID
NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the
same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1,
TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID Nos.: 88-92. Comparison of
these sequences with known sequences in the gene bank using the databases identified
above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and TbM-3,
although weak homologies were found to TbH-9. TbH-12 was found to be homologous
to a 34 kD antigenic protein previously identified in M. paratuberculosis (Acc.
No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open
reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc.
No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717,
1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS. 112, 113, 116, 118, and
119). (SEQ ID NOS. 112 and 113 are non-contiguous sequences from clone
Tb38-1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL
(SEQ. ID. NO. 114), the second, a partial sequence, may be the homologue of Tb38-1
and is called Tb38-IN (SEQ. ID NO. 115). The deduced amino acid sequence of
Tb38-1F3 is presented in SEQ. ID. NO. 117. A TbH-9 probe identified three clones in the
H37Rv library: TbH-9-FL (SEQ. ID NO. 106), which may be the homologue of TbH-9
(H37Ra), TbH-9-1 (SEQ. ID NO. 108), and TbH-9-4 (SEQ. ID NO. 110), all of which
are highly related sequences to TbH-9. The deduced amino acid sequences for these
three clones are presented in SEQ ID NOS. 107, 109 and 111.

Further screening of the M. tuberculosis genomic DNA library, as
described above, resulted in the recovery of ten additional reactive clones, representing
seven different genes. One of these genes was identified as the 38 Kd antigen discussed
above, one was determined to be identical to the 14 Kd alpha crystallin heat shock
protein previously shown to be present in M. tuberculosis, and a third was determined
to be identical to the antigen TbH-8 described above. The determined DNA sequences
for the remaining five clones (hereinafter referred to as TbH-29, TbH-30, TbH-32 and
TbH-33) are provided in SEQ ID NO: 138-141, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 142-145, respectively.
The DNA and amino acid sequences for these antigens were compared with those in the
gene bank as described above. No homologies were found to the 5' end of TbH-29
(which contains the reactive open reading frame), although the 3' end of TbH-29 was
found to be identical to the M. tuberculosis cosmid Y227. TbH-32 and TbH-33 were
found to be identical to the previously identified M. tuberculosis insertion element
IS6110 and to the M. tuberculosis cosmid Y50, respectively. No significant homologies
to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E.
coli XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant
protein was accomplished by the addition of IPTG. Induced and uninduced lysates
were run in duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters
were reacted with human M. tuberculosis sera (1:200 dilution) reactive with TbH and
rabbit sera (1:200 or 1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ.
Sera incubations were performed for 2 hours at room temperature. Bound antibody was
detected by addition of 125I-labeled Protein A and subsequent exposure to film for
variable times ranging from 16 hours to 11 days. The results of the immunoblots are
summarized in Table 4.
TABLE 4

            Human M. tb      Anti-lacZ
Antigen     Sera             Sera

TbH-29      45 Kd            45 Kd
TbH-30      No reactivity    29 Kd
TbH-32      12 Kd            12 Kd
TbH-33      16 Kd            16 Kd



Positive reaction of the recombinant human M. tuberculosis antigens
with both the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of
the human M. tuberculosis sera is directed towards the fusion protein. Antigens
reactive with the anti-lacZ sera but not with the human M. tuberculosis sera may be the
result of the human M. tuberculosis sera recognizing conformational epitopes, or the
antigen-antibody binding kinetics may be such that the 2 hour sera exposure in the
immunoblot is not sufficient.



The results of T-cell assays performed on Tb38-1, ESAT-6 and other
representative recombinant antigens are presented in Tables 5A, B and 6, respectively,
below:

TABLE 5A

Results of PBMC Proliferation to Representative Antigens

Antigen     Donor
TABLE 5B

Results of PBMC Interferon-γ Production to Representative Antigens

Antigen     Donor
            1    2    3    4    5    6    7    8    9    10    11



Tb38.1 


+++ 


+ 
+++

++

+++

+++

+++

+

+++

++

++

++


TABLE 6

Summary of T-cell Responses to Representative Antigens

Antigen     Proliferation    Interferon-γ    total
            patient 4    patient 5    patient 6    patient 4
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0 



These results indicate that both the inventive M. tuberculosis antigens
and ESAT-6 can induce proliferation and/or interferon-γ production in T-cells derived
from an M. tuberculosis-immune individual. To the best of the inventors' knowledge,
ESAT-6 has not been previously shown to stimulate human immune responses.

A set of six overlapping peptides covering the amino acid sequence of
the antigen Tb38-1 was constructed using the method described in Example 6. The
sequences of these peptides, hereinafter referred to as pep 1-6, are provided in SEQ ID
Nos. 93-98, respectively. The results of T-cell assays using these peptides are shown in
Tables 7 and 8. These results confirm the existence, and help to localize, T-cell epitopes
within Tb38-1 capable of inducing proliferation and interferon-γ production in T-cells
derived from an M. tuberculosis-immune individual.
Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially the same as that described in Example
3A. Total M. tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and
the recombinant antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels,
immobilized on nitrocellulose membranes and duplicate blots were probed using the rabbit
sera described above.

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 3A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 µg of M. tuberculosis lysate; 3) 5 µg secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 3D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis.

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9,
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control M.
tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 4A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 4B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC from a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. Use of Sera From Patients having Extrapulmonary Tuberculosis to Identify
DNA Sequences Encoding M. tuberculosis Antigens

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID Nos.: 156-158,
respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
Nos.: 159 and 160, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID No.: 161. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID No.: 162.
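
Because a cloned insert can be read on either strand, the reverse complement is routinely checked for open reading frames, as was done for XP14 above. A minimal sketch of the reverse-complement step follows; the function name and the short example sequence are hypothetical and are not taken from any SEQ ID in this document.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T/N only)."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base example.
forward = "ATGGCC"
print(reverse_complement(forward))
```

Applying the function twice returns the original strand, which is a convenient sanity check.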

Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
gene bank as described above revealed no homologies, with the exception of the 3' ends of
XP2 and XP6, which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID Nos.: 163 and 164,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
Nos.: 165-168, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID Nos.: 169 and 170; 171 and 172; 173 and 174; 175
and 176; 177 and 178; 179 and 180; and 181 and 182, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID No.: 183. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID No.: 184. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 185. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequences shown in SEQ ID Nos.:
186 and 187, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 188.

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification. As
illustrated in Figures 8A-B and 9A-B, using the assays described herein, recombinant XP1
was found to stimulate cell proliferation and IFN-γ production in T cells isolated from
M. tuberculosis-immune donors.

D. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised Against M. tuberculosis Fractionated Proteins

M. tuberculosis lysate was prepared as described above in Example 2. The
resulting material was fractionated by HPLC and the fractions screened by Western blot for
serological activity with a serum pool from M. tuberculosis-infected patients which showed
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera
was generated against the most reactive fraction using the method described in Example 3A.
The anti-sera was used to screen an M. tuberculosis Erdman strain genomic DNA expression
library prepared as described above. Bacteriophage plaques expressing immunoreactive
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences
of the M. tuberculosis clones determined.

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen,
HSP60. Of the remaining eight clones, seven (hereinafter referred to as RDIF2, RDIF5,
RDIF8, RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously
identified M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5,
RDIF8, RDIF10 and RDIF11 are provided in SEQ ID Nos.: 189-193, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID Nos.: 194-198,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID Nos.: 199
and 200, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
Nos.: 201 and 202, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. As shown in Figures 8A-B and 9A-B, these antigens were found to
stimulate cell proliferation and IFN-γ production in T cells isolated from
M. tuberculosis-immune donors.



EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al., 
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for 
standard. The American Review of Tuberculosis 44 :9-25, 1941). 

M. tuberculosis Rv strain was grown for 6 weeks in synthetic medium in roller
bottles at 37°C. Bottles containing the bacterial growth were then heated to 100°C in water
vapor for 3 hours. Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was
concentrated 20 times using a 3 kD cut-off membrane. Proteins were precipitated once with
50% ammonium sulfate solution and eight times with 25% ammonium sulfate solution. The
resulting proteins (PPD) were fractionated by reverse phase liquid chromatography
(RP-HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad HPLC system
(Perseptive Biosystems, Framingham, MA). Fractions were eluted from the column with a
linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The flow rate was 10
ml/minute and eluent was monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminus using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
No.: 129. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID Nos.: 130-133.

The ability of the antigen DPPD to stimulate human PBMC to proliferate and
to produce IFN-γ was assayed as described in Example 1. As shown in Table 9, DPPD was
found to stimulate proliferation and elicit production of large quantities of IFN-γ; more than
that elicited by commercial PPD.

TABLE 9

Results of Proliferation and Interferon-γ Assays to DPPD

PBMC Donor   Stimulator         Proliferation (CPM)   IFN-γ (OD450)
A            Medium                           1,089            0.17
             PPD (commercial)                 8,394            1.29
             DPPD                            13,451            2.21
B            Medium                             450            0.09
             PPD (commercial)                 3,929            1.26
             DPPD                             6,184            1.49
C            Medium                             541            0.11
             PPD (commercial)                 8,907            0.76
             DPPD                            23,024           >2.70
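The proliferation counts in Table 9 can be compared across donors by dividing each count by the medium-only background. The sketch below assumes this ratio (often called a stimulation index) as a convenient summary; the text itself reports only raw CPM, and the names `TABLE_9_CPM` and `stimulation_index` are illustrative.

```python
# Proliferation counts (CPM) transcribed from Table 9.
TABLE_9_CPM = {
    "A": {"Medium": 1089, "PPD (commercial)": 8394, "DPPD": 13451},
    "B": {"Medium": 450, "PPD (commercial)": 3929, "DPPD": 6184},
    "C": {"Medium": 541, "PPD (commercial)": 8907, "DPPD": 23024},
}


def stimulation_index(cpm_by_stimulator, baseline="Medium"):
    """Divide the CPM of each stimulator by the medium-only CPM.

    The ratio is an assumed summary statistic; the source reports raw CPM only.
    """
    background = cpm_by_stimulator[baseline]
    return {
        name: cpm / background
        for name, cpm in cpm_by_stimulator.items()
        if name != baseline
    }


STIMULATION_INDEX = {
    donor: stimulation_index(cpm) for donor, cpm in TABLE_9_CPM.items()
}

if __name__ == "__main__":
    for donor, indices in STIMULATION_INDEX.items():
        for name, value in indices.items():
            print(f"Donor {donor}  {name:<17} SI = {value:.1f}")
```

For every donor in the table the ratio for DPPD is larger than the ratio for commercial PPD, which matches the comparison made in the text.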

EXAMPLE 5 

USE OF REPRESENTATIVE ANTIGENS FOR DIAGNOSIS OF TUBERCULOSIS 

This example illustrates the effectiveness of several representative
polypeptides in skin tests for the diagnosis of M. tuberculosis infection.

Individuals were injected intradermally with 100 µl of either PBS or PBS plus
Tween 20™ containing either 0.1 µg of protein (for TbH-9 and TbRa35) or 1.0 µg of protein
(for TbRa38-1). Induration was measured between 5-7 days after injection, with a response
of 5 mm or greater being considered positive. Of the 20 individuals tested, 2 were PPD
negative and 18 were PPD positive. Of the PPD positive individuals, 3 had active
tuberculosis, 3 had been previously infected with tuberculosis and 9 were healthy. In a
second study, 13 PPD positive individuals were tested with 0.1 µg TbRa11 in either PBS or
PBS plus Tween 20™ as described above. The results of both studies are shown in Table 10.
TABLE 10

RESULTS OF DTH TESTING WITH REPRESENTATIVE ANTIGENS

                TbH-9       Tb38-1      TbRa35      Cumulative   TbRa11
                Pos/Total   Pos/Total   Pos/Total   Pos/Total    Pos/Total
PPD negative    0/2         0/2         0/2         0/2
PPD positive
  healthy       5/9         4/9         4/9         6/9          1/4
  prior TB      3/5         2/5         2/5         4/5          3/5
  active        3/4         3/4         0/4         4/4          1/4
TOTAL           11/18       9/18        6/18        14/18        5/13
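The TOTAL row of Table 10 is the sum of the three PPD-positive groups for each antigen. The following sketch, with illustrative names (`TABLE_10`, `column_total`, `percent_positive`) that are not part of the original text, recomputes those totals and expresses them as percentages.

```python
# (positive, tested) pairs for the PPD-positive groups, transcribed from Table 10.
TABLE_10 = {
    "TbH-9": {"healthy": (5, 9), "prior TB": (3, 5), "active": (3, 4)},
    "Tb38-1": {"healthy": (4, 9), "prior TB": (2, 5), "active": (3, 4)},
    "TbRa35": {"healthy": (4, 9), "prior TB": (2, 5), "active": (0, 4)},
    "Cumulative": {"healthy": (6, 9), "prior TB": (4, 5), "active": (4, 4)},
    "TbRa11": {"healthy": (1, 4), "prior TB": (3, 5), "active": (1, 4)},
}

# TOTAL row as printed in the table.
REPORTED_TOTALS = {
    "TbH-9": (11, 18),
    "Tb38-1": (9, 18),
    "TbRa35": (6, 18),
    "Cumulative": (14, 18),
    "TbRa11": (5, 13),
}


def column_total(groups):
    """Sum positives and individuals tested over all PPD-positive groups."""
    positives = sum(pos for pos, _ in groups.values())
    tested = sum(total for _, total in groups.values())
    return positives, tested


def percent_positive(positives, tested):
    """Return the share of tested individuals with a positive skin test."""
    return 100.0 * positives / tested


if __name__ == "__main__":
    for antigen, groups in TABLE_10.items():
        pos, tested = column_total(groups)
        print(f"{antigen:<11} {pos}/{tested} = {percent_positive(pos, tested):.0f}%")
```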



EXAMPLE 6 
Synthesis of Synthetic Polypeptides 

Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HBTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0%-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.
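The 40:1:2:2:3 cleavage mixture can be scaled to any batch size by treating the ratio as parts of a whole. The sketch below assumes the parts are measured in the same unit (for example by volume), which the text does not state; `cleavage_volumes` and the 12 ml batch are illustrative only.

```python
# Parts of each component in the cleavage mixture, as given in the text.
CLEAVAGE_PARTS = {
    "trifluoroacetic acid": 40,
    "ethanedithiol": 1,
    "thioanisole": 2,
    "water": 2,
    "phenol": 3,
}


def cleavage_volumes(total_ml):
    """Split a total amount into components according to the 40:1:2:2:3 ratio.

    Assumes every part is measured in the same unit (an assumption, not from the source).
    """
    total_parts = sum(CLEAVAGE_PARTS.values())
    return {
        name: total_ml * parts / total_parts
        for name, parts in CLEAVAGE_PARTS.items()
    }


if __name__ == "__main__":
    for name, amount in cleavage_volumes(12.0).items():
        print(f"{name:<22} {amount:.2f} ml")
```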
EXAMPLE 7

Preparation and Characterization of M. Tuberculosis Fusion Proteins

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein
TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 146 and 147), PDM-57 and PDM-58 (SEQ ID NO: 148
and 149), and PDM-69 and PDM-60 (SEQ ID NO: 150 and 151), respectively. In each case,
the DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl
each of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
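The reaction volumes and cycling programs above lend themselves to a quick arithmetic check. The sketch below is an illustration only: it assumes two primers per reaction at 2 µl each, reads the final Tb38-1 extension step as 1.5 min, and counts programmed hold times without ramping. The names `REACTION_UL`, `PROGRAMS`, `final_concentration` and `hold_time_minutes` are hypothetical.

```python
# Component volumes (µl) for one amplification reaction, as listed in the text.
REACTION_UL = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "primer 1 (10 µM)": 2.0,
    "primer 2 (10 µM)": 2.0,
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}

# Each program is a list of (number of cycles, [(temperature in °C, seconds), ...]).
PROGRAMS = {
    "TbRa3": [
        (1, [(94, 120)]),
        (40, [(96, 15), (72, 60)]),
        (1, [(72, 240)]),
    ],
    "38 kD": [
        (1, [(96, 120)]),
        (40, [(96, 30), (68, 15), (72, 180)]),
        (1, [(72, 240)]),
    ],
    "Tb38-1": [
        (1, [(94, 120)]),
        (10, [(96, 15), (68, 15), (72, 90)]),
        (30, [(96, 15), (64, 15), (72, 90)]),
        (1, [(72, 240)]),
    ],
}


def total_volume_ul(reaction):
    """Sum the component volumes of a reaction."""
    return sum(reaction.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilute a stock concentration into the total reaction volume (C1*V1 = C2*V2)."""
    return stock * volume_ul / total_ul


def hold_time_minutes(program):
    """Programmed hold time in minutes, ignoring ramping between temperatures."""
    seconds = sum(
        cycles * sum(duration for _, duration in steps)
        for cycles, steps in program
    )
    return seconds / 60


if __name__ == "__main__":
    total = total_volume_ul(REACTION_UL)
    print(f"Reaction volume: {total} µl")
    print(f"Each primer: {final_concentration(10.0, 2.0, total)} µM")
    for name, program in PROGRAMS.items():
        print(f"{name}: {hold_time_minutes(program)} min of programmed holds")
```

Under these assumptions the listed components add up to a 100 µl reaction, so each primer is diluted fifty-fold from its 10 µM stock.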

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7^L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7^L2Ra3-1 vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7^L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was
confirmed by DNA sequencing.

The expression construct was transformed into BLR pLysS E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 X g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).
The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 152 and 153,
respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 156.

The ability of the fusion protein TbH9-Tb38-1 to induce T cell proliferation
and IFN-γ production in PBMC preparations was examined using the protocol described
above in Example 1. PBMC from three donors were employed: one who had been previously
shown to respond to TbH9 but not Tb38-1 (donor 131); one who had been shown to respond
to Tb38-1 but not TbH9 (donor 184); and one who had been shown to respond to both
antigens (donor 201). The results of these studies (Figs. 5-7, respectively) demonstrate the
functional activity of both the antigens in the fusion protein.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 150) and PDM-83 (SEQ ID NO: 205) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-85
(SEQ ID NO: 206 and 207, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94°C
was performed for 2 min, followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and
72°C for 1.5 min; 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min; and
finally by 72°C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
208 and 209, respectively.

The reactivity of the fusion protein TbF-2 with sera from M. tuberculosis-infected
patients was examined by ELISA using the protocol described above. The results of
these studies (Table 11) demonstrate that all four antigens function independently in the
fusion protein.




Table 11

Reactivity of TbF-2 Fusion Recombinant with TB and Normal Sera

                    TbF              TbF-2                 ELISA Reactivity
Serum ID   Status   OD450   Status   OD450        Status   38 kD   TbRa3   Tb38-1   DPEP
B931-40    TB       0.57    +        0.321        +        -       +       -        +
B931-41    TB       0.601   +        0.396        +        +       +                -
B931-109   TB       0.494   +        0.404        +        +       +       ±        -
B931-132   TB       1.502            1.292        +        +       +                ±
5004       TB       1.806   +        1.666        +        ±       ±       +
15004      TB       2.862   +        2.468        +                +       +
39004      TB       2.443   +        1.722                 +       +       +
68004      TB       2.871            2.575        +        +       +
99004      TB       0.691   +        0.971        +        -       ±
107004     TB       0.875            0.732        +        -       ±       +
92004      TB       1.632   +        1.394                         ±       ±
97004      TB       1.491   +        1.979        +        +       ±       -        +
118004     TB       3.182   +        3.045        +        +       ±       -
173004     TB       3.644   +        3.578        +        +       +       +
175004     TB       3.332            2.916        +                +                -
274004     TB       3.696   +        3.716                 -       +                +
276004     TB       3.243            2.56         +        -       -
282004     TB       1.249   +        1.234        +        +       -
289004     TB       1.373   +        1.17                  -       +       -
308004     TB       3.708   +        3.355        +        -       -       +        -
314004     TB       1.663   +        1.399        +        -       -       +        -
317004     TB       1.163   +        0.92         +                -       -
312004     TB       1.709   +        1.453        +        -       +
380004     TB       0.238   -        0.461        +        -       ±                +
451004     TB       0.18    -        0.2          -        -       -       -        +
478004     TB       0.188   -        0.469        +        -       -       -        +
410004     TB       0.384            2.392        +        ±       -       -
411004     TB       0.306            0.874        +        -                        +
421004     TB       0.357   +        1.456        +                                 +
528004     TB       0.047            [illegible]
A6-87      Normal   0.094            0.063
A6-88      Normal   0.214            0.19
A6-89      Normal   0.248            0.125
A6-90      Normal   0.179            0.206
A6-91      Normal   0.135            0.151
A6-92      Normal   0.064            0.097
A6-93      Normal   0.072            0.098
A6-94      Normal   0.072            0.064
A6-95      Normal   0.125            0.159
A6-96      Normal   0.121            0.12

Cut-off             0.284            0.266
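The Status columns of Table 11 follow from comparing each OD450 reading with the cut-off listed for that antigen. The sketch below assumes a reading strictly above the cut-off counts as positive; how the cut-off values themselves were derived is not stated in the text. Only legible readings are included (one TbF-2 value for a TB serum could not be read), and the names `CUTOFF`, `TB_OD450`, `NORMAL_OD450` and `count_positive` are illustrative.

```python
# Cut-off values from the last row of Table 11.
CUTOFF = {"TbF": 0.284, "TbF-2": 0.266}

# OD450 readings for TB sera, in table order (legible values only).
TB_OD450 = {
    "TbF": [
        0.57, 0.601, 0.494, 1.502, 1.806, 2.862, 2.443, 2.871, 0.691, 0.875,
        1.632, 1.491, 3.182, 3.644, 3.332, 3.696, 3.243, 1.249, 1.373, 3.708,
        1.663, 1.163, 1.709, 0.238, 0.18, 0.188, 0.384, 0.306, 0.357, 0.047,
    ],
    "TbF-2": [
        0.321, 0.396, 0.404, 1.292, 1.666, 2.468, 1.722, 2.575, 0.971, 0.732,
        1.394, 1.979, 3.045, 3.578, 2.916, 3.716, 2.56, 1.234, 1.17, 3.355,
        1.399, 0.92, 1.453, 0.461, 0.2, 0.469, 2.392, 0.874, 1.456,
    ],
}

# OD450 readings for the ten normal sera.
NORMAL_OD450 = {
    "TbF": [0.094, 0.214, 0.248, 0.179, 0.135, 0.064, 0.072, 0.072, 0.125, 0.121],
    "TbF-2": [0.063, 0.19, 0.125, 0.206, 0.151, 0.097, 0.098, 0.064, 0.159, 0.12],
}


def count_positive(readings, cutoff):
    """Count readings strictly above the cut-off (assumed positivity rule)."""
    return sum(1 for od in readings if od > cutoff)


if __name__ == "__main__":
    for antigen, cutoff in CUTOFF.items():
        tb = TB_OD450[antigen]
        normal = NORMAL_OD450[antigen]
        print(
            f"{antigen}: {count_positive(tb, cutoff)}/{len(tb)} TB sera positive, "
            f"{count_positive(normal, cutoff)}/{len(normal)} normal sera positive"
        )
```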
One of skill in the art will appreciate that the order of the individual antigens
within the fusion protein may be changed and that comparable activity would be expected
provided each of the epitopes is still functionally available. In addition, truncated forms of
the proteins containing active epitopes may be used in the construction of fusion proteins.

From the foregoing, it will be appreciated that, although specific embodiments
of the invention have been described herein for the purpose of illustration, various
modifications may be made without deviating from the spirit and scope of the invention.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Reed, Steven G.
Skeiky, Yasir A.W.
Dillon, Davin C.
Campos-Neto, Antonio
Houghton, Raymond
Vedvick, Thomas S.
Twardzik, Daniel R.
Lodes, Michael J.

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR IMMUNOTHERAPY
AND DIAGNOSIS OF TUBERCULOSIS

(iii) NUMBER OF SEQUENCES: 214

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: SEED and BERRY LLP

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue

(C) CITY: Seattle

(D) STATE: Washington

(E) COUNTRY: USA

(F) ZIP: 98104-7092

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 01-OCT-1997

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Maki, David J.

(B) REGISTRATION NUMBER: 31,392

(C) REFERENCE/DOCKET NUMBER: 210121.411C7

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (206) 622-4900

(B) TELEFAX: (206) 682-6031



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60
ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120
GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 180
GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 240
GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 300
GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 360
AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 420
GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 480
GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 540
ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 600
GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 660
GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 720
GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA 766



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 752 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 60
GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 120
GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 180
TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 240
TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 300
TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360
TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 420
ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 480
GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 540
CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 600
CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GGTTTCACCC CNTACCNACT GCCNCCCGGN 660
TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 720
TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 752

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 60
CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 120
CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 180
GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 240
ATGCTGGTCA CGGCTGTCGT TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 300
ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 360
GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGACCA GAAGTCGCTG 420
GAAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG CGGCCACATC GTCCACTCCA 480
CGCGAAGCCC CCTACGAATT GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 540
CGTGGTACGC AGGCCGTGGT GCTCAMGGTC TACCACAACG CCGGCGGCAC GCACCCAACG 600
ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTATGACACG 660
CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTCTTCC CCATTGTTGC AAGGTGAACT 720
GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 780
TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 813

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 447 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60
CATTCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120
CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCTTGGGT GTTGTCGACA ACAACGGCAA 180
CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240
CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300
CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360
CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420
ATACCACCCG CCGGCCGGCC AATTGGA 447
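Each entry in the listing states a LENGTH and then gives the sequence in blocks of ten bases with a running position count at the end of every row. A minimal sketch of how such an entry could be checked is shown below, using SEQ ID NO: 4; the function names `parse_listing` and `ambiguity_counts` are illustrative, and letters other than A, C, G and T are treated as IUPAC ambiguity codes (for example N for any base, Y for C or T).

```python
from collections import Counter

# SEQ ID NO: 4 as printed in the listing: ten-base blocks plus running counters.
SEQ_ID_NO_4 = """
CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60
CATTCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120
CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCTTGGGT GTTGTCGACA ACAACGGCAA 180
CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240
CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300
CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360
CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420
ATACCACCCG CCGGCCGGCC AATTGGA 447
"""

DECLARED_LENGTH = 447  # from "(A) LENGTH: 447 base pairs"


def parse_listing(text):
    """Join the base blocks of a listing entry, dropping the position counters."""
    blocks = [token for token in text.split() if not token.isdigit()]
    return "".join(blocks).upper()


def ambiguity_counts(sequence):
    """Count every symbol that is not an unambiguous base (A, C, G or T)."""
    return Counter(base for base in sequence if base not in "ACGT")


if __name__ == "__main__":
    sequence = parse_listing(SEQ_ID_NO_4)
    print(f"parsed {len(sequence)} bases, declared {DECLARED_LENGTH}")
    print(f"ambiguity codes: {dict(ambiguity_counts(sequence))}")
```

The same check applies to any other entry in the listing by substituting its rows and declared length.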



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 604 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60
CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180
CCGGCGACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240
ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300
TGAATGATGG AAAACGGGNG GNGACGTCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360
NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420
NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG NTTGATGNGA 480
NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540
NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600
NAAT 604



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 60
CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 120
TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA 180
CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 240
CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT CGACGTCGTC 300
ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG 360
GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 420
CGCGATCAAC GGCGCCGCGG TCACCGGCGG GCTCGAACTG GCGCTGTACT GCGACATCCT 480
GATCGCCTCC GAGCACGCCC GCTTCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 540
CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG 600
CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 633

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC GACGCGGCCC TGGCCAGAGT 60
CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATATTGA GCCCGTCGCG 120
CCCCGCCGAG CCGGCGGCGC GGTCGCCGAG GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 180
CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG GACTGCTCAC CGCCGGCTGG 240
GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 300
GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TCGACGCACA CACCACCATG 360
CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420
GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG GAACCGGGAC ACCGGCGGGA 480
CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC 540
CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 600
CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 660
GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 720
GCATGGGCAA CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780
GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAGGCGGGT CGTGGGGTCG 840
TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACGAGCACAC CGCCGAGCTG 900
CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960
GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCCTGCTCG ACACCGATGC GGCGCTGGTT 1020
GGCGCCCTGG CCTGGGCCGC CTTCACCGCC GCGCGGCGCA TCGGCACCTG GATCGGCGCC 1080
GCCGCCGAGG GCCAGGTGTC GCGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140
TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG CGACACAGGT 1200
GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTCGGT TGGCCGCACT 1260
GGCCGATCAG GTCGGCGCCG GCCCTTGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320
GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG AA 1362



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1458 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 60
GTATCGCTCC CGTTGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCCGGTC 120
TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 180
CTAAGGCCTT GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 240
TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 300
TGATGGACCG ATCGGCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 360
CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 420
CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC 480
CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC 540
GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600
CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGTTTCT ACGGCTGTAT GACAGTGCCG 660
CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 720
CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780
ATTTCAACCT ATCGGTTGGT GTGACCGACG CGTTCCTGCG GGCCGTCGAA CGCAACGGCC 840
TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900
TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGCGA TCCCGGGCTG GTGTTTCTCG 960
ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 1020
GCGGGGAGGT CCCACTGCTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 1080
GGATGCTCGC CGACGGTCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 1140
TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 1200
CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGCTTGCCG 1260
CACTGGGTAT TCCGTACGAC AGTGAAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320
GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 1380
CGGCGTTCAC CGATAGCCGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440
CCGTCGCTCC GACGGGCA 1458



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC 60
GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG 120
TCATCGCCTT CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180
CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG 240
AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG 300
CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGG 360
CCGTCAAAGG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA 420
TGGTGGTCAC CAACATCGGC CTGGTGTCCT GTAAACGCGA CGTTGGGGCC GCGGTGTTGG 480
CCGCCTACGT TTACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT 540
CGAATGAGAC GCTGGTCAAG ACGTTTTCCC CCGGTGAGCA GGTAACGACC GCGGTGACCT 600
GGACCGGGAT GGGATCGGCG CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA 660
CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA 720
ATCAGCCGCC GCCGCCGCCC GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC 780
CGGAGTCTCC CGCGCAAGGC GGATAATTAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT 840
GACAACCCCT CGCCTCGTGC CG 862

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA 60
GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA 120
GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG 180
TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CTTTCAGGAT 240
CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG 300
TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT 360
TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG 420
ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGGGTTTGTT GTCGACGGCG 480
TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCTGAACACA GCGGTGATCG 540
GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC GATGTCACCC 600
CGGAAGCCAC CCGNGACATT CT 622



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGCGGTGGC 60
ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG 120
AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCATGGAGCA GTTCGTCTAT 180
GCCTACGTGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG GTCCGGTGCC 240
GGGGTGACCC AGTTTCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT CCCGTTGAAT 300
CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 360
CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG CACGCTGAAT 420
CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG GAATGATCCA 480
CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG CGTTATCTTC 540
CGCAGCGACA AGTCCGGTAC GTCGGACAAC TTCCAGAAAT ACCTCGACGG TGTATCCAAC 600
GGGGCGTGGG GCAAAGGCGC CAGCGAAACG TTCAGCGGGG GCGTCGGCGT CGGCGCCAGC 660
GGGAACAACG GAACGTCGGC CCTACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 720
TGGTCGTTTG CGGTGGGTAA GCAGTTGAAC ATGGCCCAGA TCATCACGTC GGCGGGTCCG 780
GATCCAGTGG CGATCACCAC CGAGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG 840
GGACAAGGCA ACGACCTGGT ATTGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 900
TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG 960
ACCGGTACTG CGGTAAGGGC GTTTATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 1020
GACCAATACG GCTCCATTCC GTTGCCCAAA TCGTTCCAAG CAAAATTGGC GGCCGCGGTG 1080
AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGTGA GCGATGCCGT TCCGCAGGTA 1140
GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GATGGGCGAG 1200

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GCAAGCAGCT GCAGGTCGTG CTGTTCGACG AACTGGGCAT GCCGAAGACC AAACGCACCA 60
AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC 120
CGTTTCTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180
GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240
CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG 300
ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360
CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420
TCATCGAGGC GTTCAACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTCG 480
GTGTGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCCTACG 540
GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGAAAATC TCCACCGAGG 600
AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC 660
GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720
GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 780
CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840
TCGACAAGGC GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900
AGCTGCTGTT CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960
AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA 1020
GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA TTCGGCGATT 1080
TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TTTGTCCAGC GTGTACCCGT 1140
CGAGTAGCCT CGTCA 1155

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1771 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



GAGCGCCGTC TGGTGTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT 60
TCGGGCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120
ATCGAAGACA CCGACCCAGG CGGCCAGCCC CCTGGAACGT CGATTTACGT GCTGCTCCCC 180
GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240
ATCGAGAACT CTCGGGGTTC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300
ACCTAGTTGT GCAGTTACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360
GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420
GTATTCGCCA CCGCCGCAGC AGCCGGGAAC CCCAGGTTAT GCTCAGGGGC AGCAGCAAAC 480
GTACAGCCAG CAGTTCGACT GGCGTTACCC ACCGTCCCCG CCCCCGCAGC CAACCCAGTA 540
CCGTCAACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600


GACCATGACG CCCCCTCCTG GGATGGTTCG CCAACGCCCT CGTGCAGGCA TGTTGGCCAT 660
PGGCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC GGCGCGGCCG CATCCCTGGT 720
rr;GGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA GCGCGGCGCC 780
[illegible] GCAGCAAACA TGCCGCCGGG GTCGGTCGAA CAGGTGGCGG CCAAGGTGGT 840
[illegible] GTCATGTTGG AAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900
[illegible] GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960
[illegible] GGCAGTCCGC CGCCGAAAAC GACGGTAACC TTCTCTGACG GGCGGACCGC 1020
[illegible] GTGGTGGGGG CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080
[illegible] PTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCAGCC 1140
[illegible] ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCACGG GGATCGTCAG 1200
[illegible] CGTCCAGTGT CGACGACCGG CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260
[illegible] ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320
[illegible] CTCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA 1380
TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGATTCCA GTCGACCAGG CCAAGCGCAT 1440
CGCCGACGAL> TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC 1500
CAATGACAAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560
GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG 1620
CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC 1680
CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740
GTGATGAAGG TCGCCGCGCA GTGTTCAAAG [illegible] 1771



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 60
ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180
CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 240
ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300
TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360
CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420
TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480
CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540
TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600
GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660
AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720
TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780
GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840
CTGTCAAGAT GCTTGATCCT GGCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900
AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960
TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020
GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC 60 

GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 120 

CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA 180 

AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA 240

AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 300

GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG 360

CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA 420

CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 480

AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC 540

GG 542 
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC 60
CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC 120
TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG CACCGTTACC 180
GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT GAACCGCCGC 240
CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC [illegible] [illegible] 300
CCGAACAGCC AMGCACCGTT GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC 360
GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC 420
GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG 480
CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 540
TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG 600
CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC 660
CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT 720
TGGTGCTGCT GAAGCCGTTA GCGCCGGTTC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC 780
CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 840
TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTTGG CCGCCGGCGC 900
CGCCGGCGGC CGC 913
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1872 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 60
TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 120
GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 180
GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 240
CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 300
CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 360
GACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGTTCAGCG TCGGCTCCGG 420
CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 480
GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 540
GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCGGA ACGCCCCGTG CGGTGCCTGG 600
CAGGGTGGTC GCGCTCGGCC AAACCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGAAGA 660
GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGTGATT CGGGCGGGCC 720
CGTCGTCAAC GGCCTAGGAC AGGTGGTCGG TATGAACACG GCCGCGTCCG ATAACTTCCA 780
GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840
CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 900
CTTGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960
TCCGGCGGCA AGTCTCGGCA TCTCCACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC 1020
GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTAACGGG CATCATCCCG GTGACGTCAT 1080
CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 1140
GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 1200
CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CATTGTGGAA 1260
GCAATGAACG AGGCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 1320
GGCGGTGTGG TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380
GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 1440
GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500
TGGCTTGGCA TCGACTGCAT CTGTTGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 1560
GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 1620
GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA CCTGGTGATG 1680
AATCACACCT CGGAGTCGCA CCCCTGGTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 1740
TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 1800
TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA GTTNCTACTG 1860
GCACCGATTC TT 1872



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60
CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120
ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 180
TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240
GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 300
ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360
CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 420
TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCAAGATG CTTGATCCTG 480
GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540
TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600
AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTTGTG 660
AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 720
GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780
CGGTCTTTGA GCCGGTAGCT GTCGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGAGG 840
CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900
AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 960
AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGG 1020
AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 1080
GCGAACCGTG CTACCCATTC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 1140
GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCGG TGCCAGGCGG GGCCCAAAAA 1200
CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 1260
TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGCG 1320
GGCGGCGCGG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380
GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 1440
GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 1482




(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT 60 

CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120 

CGGTCACTCG TTGCTGCTGG ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 180 

CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 240

GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300 

GGAGAACTTC GATCCCGAGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA 360 

GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420

GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 480

TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 540

TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGGTGAT 600 

CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 660 

ATACCTCACG TTGGGCACCG ACGGGTTCGG TTTTTCCGAC ACTCGGCCCG CCGGTCGTCG 720

TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTTGGGAGGG GTTGGCCGGG 780

TCGACGGGTG AATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 840

ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
ATCCCCCCGG GCTGCAGGAA TTCGGCACGA GAGACAAAAT TCCACGCGTT AATGCAGGAA 60 
CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGCGGT TTATTTCGAC 120
AGCGAAGACC TGCCGCAGTT GGCGAAGCAT TTTTACAGCC AAGCGGTCGA GGAACGAAAC 180
CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 240
GTAGACACGG TGCGAAACCA GTTCGACAGA CCCCGCGAGG CACTGGCGCT GGCGCTCGAT 300
CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 360
GATTTCCTCG GCGAGCAGTT CATGCAGTGG TTCTTGCAGG AACAGATCGA AGAGGTGGCC 420
TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTT CGAGCTAGAG 480
AACTTCGTCG CACGTGAAGT GGATGTGGCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 540
GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 600
TCCAGCCAGG CCTTGGTGCG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 660
CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 720
GCTGCCGAGC GGTCAACGAG TTGCGGATAT TCCTTTAACG CAGGCAGTGA GGGTCCCACG 780
GCGGTTGGCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC 840
AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGGGGGG GCGAAAACGA 900
CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 960
GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 1020
T 1021

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 60 
ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 120 
CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 180 
TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240

CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300 
GGNGNGNATC GNCGANCACA A 321
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 60
CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120
CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 180
ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240
GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300
GGTGGATCCC AAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360
CTTACCATCG CCG 373



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GTGACGCCGT GATGGGATTC CTGGGCGGGG CCGGTCCGCT GGCGGTGGTG GATCAGCAAC 60 

TGGTTACCCG GGTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120

TCTTGACGGC CTGGTACGGG TTGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180

TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240

GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300 

TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60
GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120
CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180
GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240
GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300
CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360
GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420
TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480
CGTTGACCGG GCCGCATCAG CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540
CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600
AAAATAGGCG ACGGTTTTGG CAACCGCTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660
GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC CGTTCGCCGG 720
ATCGTG 726

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60
GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120
CTGCCCGATG GCGGCCCGGT GAAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180
ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240
GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300
AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360
AGGGGCGCGG CATTGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420
TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACC 480
AGCCCAACGG CGAATGGTCG GCGTTACGCG CACACCTTCC GGTAGATGTC CAGTGTCTGC 540
TCGGCGATGT ATGCCCAGGA GAACTCTTGG ATACAGCGCT 580
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
AACGGAGGCG CCGGGGGTTT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 60 
GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120 
GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 60 

CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 120 

AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 180 

GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 240

GCCTACGAGC GCAACGTACA GACCAACGCC CG 272
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA 60 
AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120
CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180
GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240
GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300
CGGCCTGGTT GCGCGGG 317

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 
GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 120
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 180
GG 182

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:



GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60
CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120
GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180
GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240
CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300
ACGTTTGG 308

(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60 

CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120 

GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180 

ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240

TCGACGCGGC AATCCAGGGC GGTCTGG 267
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1539 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60 

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180 

ACGGAGCGGT CGCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 240

AACCTTCAAG CGCGGCCGAT AACTGAGGTG CATCATTAAG CGACTTTTCC AGAACATCCT 300

GACGCGCTCG AAACGCGGTT CAGCCGACGG TGGCTCCGCC GAGGCGCTGC CTCCAAAATC 360

CCTGCGACAA TTCGTCGGCG GCGCCTACAA GGAAGTCGGT GCTGAATTCG TCGGGTATCT 420

GGTCGACCTG TGTGGGCTGC AGCCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 480

GCGGATGGCG TTGCCGCTCA CCGGCTATCT GAACAGCGAG GGACGCTACG CCGGCTTCGA 540

TATCTCGCAG AAAGCCATCG CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 600 

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 660 

ACTAGACTTT CGCTTTCCAT ATCCGGATGC GTCGTTCGAT GTGGTGTTTC TTACCTCGGT 720 

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 780

GAAGCCCGGC GGACGATGCC TGTGCACGTA CTTCTTGCTC AATGACGAGT CGTTAGCCCA 840

CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 900

CCACAAGAAG CGGCCCGAAG AAGCAATCGG CTTGCCGGAG ACCTTCGTCA GGGATGTCTA 960

TGGCAAGTTC GGCCTCGCCG TGCACGAACC ATTGCACTAC GGCTCATGGA GTGGCCGGGA 1020 

ACCACGCCTA AGCTTCCAGG ACATCGTCAT CGCGACCAAA ACCGCGAGCT AGGTCGGCAT 1080 

CCGGGAAGCA TCGCGACACC GTGGCGCCGA GCGCCGCTGC CGGCAGGCCG ATTAGGCGGG 1140

CAGATTAGCC CGCCGCGGCT CCCGGCTCCG AGTACGGCGC CCCGAATGGC GTCACCGGCT 1200 

GGTAACCACG CTTGCGCGCC TGGGCGGCGG CCTGCCGGAT CAGGTGGTAG ATGCCGACAA 1260 

AGCCTGCGTG ATCGGTCATC ACCAACGGTG ACAGCAGCCG GTTGTGCACC AGCGCGAACG 1320 

CCACCCCGGT CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCCCACA TGACCAAACC 1380 

CCGGCATCAC GTTGCCGATC GGCATACCGT GATAGCCAAG ATGAAAATTT AAGGGCACCA 1440

ATAGATTTCG ATCCGGCAGA ACTTGCCGTC GGTTGCGGGT CAGGCCCGTG ACCAGCTCCC 1500 

GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 1539 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60
CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120
CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180
CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240
GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300
GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360
CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 420
GGCCCAGCCC ACGCAGGGCA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480
CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540
GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600
GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660
GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720
GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780
GAACGGTGGT CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840
GAAACAGTTA C 851

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC 60

CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG 120 

CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACACCC 180 

CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC 240

GCTTGGTCAA GATC 254 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

GATCCTGACC GAAGCGGCCG CCGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 60 

CGATCTGGCG CTGCGGATCG CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 120 

TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGGTT 180 

GATCGTGGAC CGGATGAGCG CGCCGTATGT GGAAGGCGCG TCGATCGATT TCGTCGACAC 240

TATTGAGAAG CAAGGTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG CGCGTGCGGG 300 

GATTCGTTCA ACTGATAAAA CGCTAGTACG ACCCCGCGGT GCGCAACACG TACGAGCACA 360 

CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATGCC TTGCACCTGA CCGCGTGGCG 420

GGCCGCCGGC GGCAGGTGTC ACCTGCATGG TGAACAGCAC CTGGGCCTGA TATTGCGACC 480

AGTACACGAT TTTGTCGATC GAGGTCACTT CGACCTGGGA GAACTGCTTG CGGAACGCGT 540

CGCTGCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAC 600

CGCACAGCGC ATTGCGAACG ATGGTGTCCA CATCGCGGTT CTCCAGCGCG TTGAGGTATC 660

CCTGAATCGC GGTTTTGGCC GGTCCCTCCG AGAATGTGCC TGCCGTGTTG GCTCCGTTGG 720

TGCGGACCCC GTATATGATC GCCGCCGTCA TAGCCGACAC CAGCGCGAGG GCTACCACAA 780

TGCCGATCAG CAGCCGCTTG TGCCGTCGCT TCGGGTAGGA CACCTGCGGC GGCACGCCGG 840

GATATGCGGC GGGCGGCAGC GCCGCGTCGT CTGCCGGTCC CGGGGCGAAG GCCGGTTCGG 900

CGGCGCCGAG GTCGTGGGGG TAGTCCAGGG CTTGGGGTTC GTGGGATGAG GGCTCGGGGT 960

ACGGCGCCGG TCCGTTGGTG CCGACACCGG GGTTCGGCGA GTGGGGACCG GGCATTGTGG 1020

TTCTCCTAGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTCAGCC 1080

GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG CGCAACAGCT GCCGTCAGCT 1140

CTCAACGCGA CGGGGCGGGC CGCGGCGCCG ATAATGTTGA AAGACTAGGC AACCTTAGGA 1200

ACGAAGGACG GAGATTTTGT GACGATC 1227

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 181 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG 60 
GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 120 
GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 180 

G 181

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60
GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120
GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180
CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240
GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290



(2) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 34



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC 60 
TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120 
TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG 155 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120
AGGGCGGCAA CG 132

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA 60
CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG 120
GCANCGGCGG CA 132

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 702 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540
CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600
TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660
GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60 

GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120 

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180 

CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60
CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120
GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180
TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240
TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300
CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360
AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420
CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480
TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA 540
CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600
ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660
CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720
ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780
CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA 840
CCGGCTTTCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG 900
GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960
GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020
GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60
CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120
AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180
CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240
AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300
CCGCTAATAC GAAAAGAAAC GGAGCAA 327



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 60 
CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120 
GGGCCGT 127
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 60 
CGGCGGCTCC GGCCTCAACG G 81 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60
GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 120
GAAACGGTGG TGCCGGTGGG CTGATCTGG 149

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 60 

ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120 

TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180 

CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGATAACTGA 240

GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 300 

ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 355 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 60
CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120
CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180
CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG 240
GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300
GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360
GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420
GCCCACTTCG ACTACGGTTC AGCACTCCTC AGCAAAACCA CCGGGGACCC GCCATTTCCC 480
GGACAGCCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 540
CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600
GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660
GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720
CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780
GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 840
GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900
GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960
CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro
35 40 45

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
50 55 60

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro
65 70 75 80

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
130 135 140

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
195 200 205

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys
225 230 235 240

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
275 280 285

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
305 310 315 320

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gl:
1 5 10 15



Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 
1 5 10 15 



Ala 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro
145

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr
1 5 10 15

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125

Gly Pro Pro Ala
130



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys
100



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr
35 40 45

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg
65 70 75 80

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly
85 90 95

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr
115 120 125

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr
130 135 140

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu
180 185 190

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro
195 200 205

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe
210 215 220

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro
225 230 235 240

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Gln Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr
305 310 315 320

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu
100 105 110

Asp Gln Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala
115 120 125

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met
130 135 140

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro
145 150 155 160

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala
165 170 175

Glu Leu Gln Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu
180 185 190

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly
195 200 205

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser
210 215 220

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser
225 230 235 240

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser
245 250 255

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu
260 265 270

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr
275 280 285

Gly Lys Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala Ile
290 295 300

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp
305 310 315 320

Thr Ile Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg Ile Glu Ala
325 330 335

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn
340 345 350

Leu Gly Ser Ile Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp
355 360 365

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp
370 375 380

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala
385 390 395 400

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu
405 410 415

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp
450 455 460

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480

Val Ala Pro Thr Gly
485

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu
1 5 10 15

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val
20 25 30

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala
35 40 45

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His
50 55 60

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu
65 70 75 80

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro
85 90 95

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp
100 105 110

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro
115 120 125

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn
130 135 140

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala
145 150 155 160

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp
165 170 175

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu
180 185 190

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30
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GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 120 
GAAACGGTGG TGCCGGTGGG CTGATCTGG 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 



CGGCACGAGA 


TCACACCTAC 


CGAGTGATCG 


AGATCGTCGG 


GACCTCGCCC 


GACGGTGTCG 


60 


ACGCGGNAAT 


CCAGGGCGGT 


CTGGCCCGAG 


CTGCGCAGAC 


CATGCGCGCG 


CTGGACTGGT 


120 


TCGAAGTACA 


GTCAATTCGA 


GGCCACCTGG 


TCGACGGAGC 


GGTCGCGCAC 


TTCCAGGTGA 


180 


CTATGAAAGT 


CGGCTTCCGC 


CTGGAGGATT 


CCTGAACCTT 


CAAGCGCGGC 


CGATAACTGA 


240 


GGTGCATCAT 


TAAGCGACTT 


TTCCAGAACA 


TCCTGACGCG 


CTCGAAACGC 


GGTTCAGCCG 


300 


ACGGTGGCTC 


CGCCGAGGCG 


CTGCCTCCAA 


AATCCCTGCG 


ACAATTCGTC 


GGCGG 


355 


(2) INFORMATION FOR SEQ ID NO: 52 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



ATGCATCACC 


ATCACCATCA 


CATGCATCAG 


GTGGACCCCA 


ACTTGACACG 


TCGCAAGGGA 


60 


CGATTGGCGG 


CACTGGCTAT 


CGCGGCGATG 


GCCAGCGCCA 


GCCTGGTGAC 


CGTTGCGGTG 


120 


CCCGCGACCG 


CCAACGCCGA 


TCCGGAGCCA 


GCGCCCCCGG 


TACCCACAAC 


GGCCGCCTCG 


180 


CCGCCGTCGA 


CCGCTGCAGC 


GCCACCCGCA 


CCGGCGACAC 


CTGTTGCCCC 


CCCACCACCG 


240 


GCCGCCGCCA 


ACACGCCGAA 


TGCCCAGCCG 


GGCGATCCCA 


ACGCAGCACC 


TCCGCCGGCC 


300 


GACCCGAACG 


CACCGCCGCC 


ACCTGTCATT 


GCCCCAAACG 


CACCCCAACC 


TGTCCGGATC 


360 


GACAACCCGG 


TTGGAGGATT 


CAGCTTCGCG 


CTGCCTGCTG 


GCTGGGTGGA 


GTCTGACGCC 


420 


GCCCACTTCG 


ACTACGGTTC 


AGCACTCCTC 


AGCAAAACCA 


CCGGGGACCC 


GCCATTTCCC 


480 


GGACAGCCGC 


CGCCGGTGGC 


CAATGACACC 


CGTATCGTGC 


TCGGCCGGCT 


AGACCAAAAG 


540 
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CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600 

GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660 

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 7 20 

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780 

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 84 0 

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900 

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960 

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 99 9 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gin Val Asp Pro Asn Leu Thr 
15 10 15 

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala lie Ala Ala Met Ala Ser 
20 25 30 

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro 
35 40 45 

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr 
50 55 60 

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro 
65 70 75 80 

Ala Ala Ala Asn Thr Pro Asn Ala Gin Pro Gly Asp Pro Asn Ala Ala 
85 90 95 

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val lie Ala Pro 
100 105 110 

Asn Ala Pro Gin Pro Val Arg lie Asp Asn Pro Val Gly Gly Phe Ser 
115 120 125 

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp 
130 135 140 

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro 
145 150 155 * 160 

Gly Gin Pro Pro Pro Val Ala Asn Asp Thr Arg lie Val Leu Gly Arg 
165 170 175 
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Leu Asp Gin Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala 
180 185 ■ Lyu 

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro 
195 200 20b 

Glv Thr Arg He Asn Gin Glu Thr Val Ser Leu Asp Ala Asn Gly Val 
210 215 220 

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys 

225 230 235 

Pro Asn Gly Gin He Trp Thr Gly Val lie Gly Ser Pro Ala Ala Asn 

- 250 



245 



Ala Pro Asp Ala Gly Pro Pro Gin Arg Trp Phe Val Val Trp Leu Gly 
260 265 Z/U 

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu 
275 280 285 

Ser He Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro 
2 90 295 300 

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr 
305 310 315 320 

Pro Thr Thr Pro Thr Pro Gin Arg Thr Leu Pro Ala 
325 330 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Asp Pro Val Asp Ala Val He Asn Thr Thr Xaa Asn Tyr Gly Gl: 
1 5 10 15 

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
15 10 15 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
15 10 15 

Glu Gly Arg 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Tyr Tyr Trp Cys Pro Gly Gin Pro Phe Asp Pro Ala Trp Gly Pro 
15 10 15 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asp lie Gly Ser Glu Ser Thr Glu Asp Gin Gin Xaa Ala Val 
15 10 

(2) INFORMATION FOR SEQ ID NO: 59: 
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(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 59: 

Ala Glu Glu Ser He Ser Thr Xaa Glu Xaa He Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 
1 5 10 lb 



Ala 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 



Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu 
1 5 10 15 

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr 
1 5 10 15 

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125

Gly Pro Pro Ala
130



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys
100

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr
35 40 45

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg
65 70 75 80

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly
85 90 95

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr
115 120 125

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr
130 135 140

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu
180 185 190

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro
195 200 205

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe
210 215 220

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro
225 230 235 240

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr
305 310 315 320

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu
100 105 110

Asp Gln Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala
115 120 125

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met
130 135 140

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro
145 150 155 160

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala
165 170 175

Glu Leu Gln Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu
180 185 190

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly
195 200 205

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser
210 215 220

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser
225 230 235 240

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser
245 250 255

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu
260 265 270

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr 
275 280 285 

Gly Lys Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala Ile
290 295 300

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp
305 310 315 320

Thr Ile Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg Ile Glu Ala
325 330 335

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn
340 345 350

Leu Gly Ser Ile Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp
355 360 365

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp
370 375 380

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala
385 390 395 400

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu
405 410 415

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp
450 455 460

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480 

Val Ala Pro Thr Gly 
485 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu
1 5 10 15

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val
20 25 30

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala
35 40 45

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His
50 55 60

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu
65 70 75 80

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro
85 90 95

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp
100 105 110

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro
115 120 125

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn
130 135 140

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala
145 150 155 160

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp
165 170 175

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu
180 185 190

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30

Gln



(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala
1 5 10 15

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser
20 25 30

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser
35 40 45

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg
50 55 60

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala
65 70 75 80

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp
85 90 95

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg
100 105 110

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala
115 120 125

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro
130 135 140

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro
145 150 155 160

Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile
165 170 175

Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln
180 185 190

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser
195 200 205

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly
210 215 220

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu
225 230 235 240

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr
245 250 255

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys
260 265 270

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu
275 280 285

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile
290 295 300

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr
305 310 315 320

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly
325 330 335

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe
340 345 350

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser
355 360 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp
1 5 10 15

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val
20 25 30

Val Val Arg Gln Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro
35 40 45

Pro Arg Arg His Pro Ala Gln Gly His Arg Arg Arg Val Ala Pro Ser
50 55 60

Gly Gly Arg Arg Arg Pro His Pro His His Val Gln Pro Asp Asp Arg
65 70 75 80

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gln Pro Ala Glu His Pro
85 90 95

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg
100 105 110

Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp
115 120 125

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val
130 135 140

Gln His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg
145 150 155 160

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gln Gly
165 170 175

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala
180 185 190

Ala Val Glu Asn Leu His Arg Gly Ser Gln Arg Ala Asp Gly Arg Val
195 200 205

Phe Arg Pro Ile Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg
210 215 220

Ala Gly Pro Gln Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro
225 230 235 240

Leu Pro Ala Arg Ala Gly Gln Gln Gln Pro Ser Ser Ala Gly Gly Arg
245 250 255

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gln Arg Gly Arg His
260 265 270

His Gln Gly Gly His Asp Pro Gly Arg Gln Gly Ala Gln Arg Gly Thr
275 280 285

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg
290 295 300 

Asn Arg Pro Arg Arg 
305 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly
1 5 10 15

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys
20 25 30

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gln Ala Ala
35 40 45

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys
50 55 60

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr
65 70 75 80

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser
85 90 95

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His
100 105 110

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gln
115 120 125

Glu Glu Gln Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro
130 135 140

Pro Gln Gln Pro Gly Thr Pro Gly Tyr Ala Gln Gly Gln Gln Gln Thr
145 150 155 160

Tyr Ser Gln Gln Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gln
165 170 175

Pro Thr Gln Tyr Arg Gln Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro
180 185 190

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met
195 200 205

Val Arg Gln Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr
210 215 220

Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val
225 230 235 240

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala
245 250 255

Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val
260 265 270

Glu Gln Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr
275 280 285

Asp Leu Gly Arg Gln Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala
290 295 300

Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys
305 310 315 320

Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp
325 330 335

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp
340 345 350

Ile Ala Val Val Arg Val Gln Gly Val Ser Gly Leu Thr Pro Ile Ser
355 360 365

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gln Pro Val Leu Ala Ile
370 375 380

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser
385 390 395 400

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gln Asn
405 410 415

Thr Val Leu Asp Ala Ile Gln Thr Asp Ala Ala Ile Asn Pro Gly Asn
420 425 430

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gln Leu Val Gly Val Asn
435 440 445

Ser Ala Ile Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gln Ser Gly
450 455 460

Ser Ile Gly Leu Gly Phe Ala Ile Pro Val Asp Gln Ala Lys Arg Ile
465 470 475 480

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly
485 490 495

Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu
500 505 510

Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val
515 520 525

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu
530 535 540

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr
545 550 555 560

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly
565 570 575

Lys Ala Glu Gln
580 



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu
1 5 10 15

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro
20 25 30

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro
35 40 45

Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu
50 55 60

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu
65 70 75 80

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala
85 90 95

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg
100 105 110

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn
115 120 125

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala
130 135 140

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln
145 150 155 160

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr
165 170 175

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala
180 185 190

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val
195 200 205

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser
210 215 220 

Lys Trp Asn Glu Pro Val Asn Val Asp 
225 230 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala
1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val
20 25 30

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile
35 40 45

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln
50 55 60

Pro Arg
65



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser
1 5 10 15

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60 

Ser Pro Pro Leu Pro 
65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser
1 5 10 15

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala
20 25 30

Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu
35 40 45

Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val
50 55 60

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr
65 70 75 80

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
85 90 95

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln
100 105 110

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala
115 120 125

Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly
130 135 140

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly
145 150 155 160

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu
165 170 175

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr
180 185 190

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser
195 200 205

Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr
210 215 220

Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala
225 230 235 240

Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly
245 250 255

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu
260 265 270

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val
275 280 285

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile
290 295 300

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp
305 310 315 320

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln
325 330 335

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly
340 345 350

Pro Pro Ala
355

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr
1 5 10 15

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala
20 25 30

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys
35 40 45

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala
50 55 60

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly
65 70 75 80

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp
85 90 95

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val
100 105 110

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn
115 120 125

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys
130 135 140

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly
145 150 155 160

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser
165 170 175

His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln
180 185 190

Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp
195 200 205 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val
1 5 10 15

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln
20 25 30

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val
35 40 45

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu
50 55 60

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe
65 70 75 80

Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu
85 90 95

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala
100 105 110

Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val
115 120 125

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp
130 135 140

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn
145 150 155 160

Arg Asp Gly Val Val Ile Glu Thr Glu Lys Leu Arg His Pro Asp Arg
165 170 175

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly
180 185 190

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile
195 200 205

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe
210 215 220

Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp
225 230 235 240

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg
245 250 255

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln
260 265 270

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys 
275 280 285 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr
1 5 10 15

Ala Ala Gln Gln Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp
20 25 30

Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg
35 40 45

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asp Leu Arg
50 55 60

Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro
65 70 75 80

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp
85 90 95

Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu
100 105 110

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val
115 120 125

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn
130 135 140

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro
145 150 155 160

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu
165 170 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile
1 5 10 15

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly
20 25 30

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro
35 40 45

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa
50 55 60

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp
65 70 75 80

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile
85 90 95

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln
100 105 

(2) INFORMATION FOR SEQ ID NO: 84: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn 
1 5 10 15 

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr
20 25 30

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly
35 40 45

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr
50 55 60

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr
65 70 75 80

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu
85 90 95

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr
100 105 110 

Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val 
1 5 10 15 

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala
20 25 30

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu
35 40 45

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala
50 55 60

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg Gln Trp
65 70 75 80

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu
85 90 95

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa
100 105 110

Arg Ser Ser Xaa Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu
1 5 10 15

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln
20 25 30

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp
35 40 45

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe
50 55 60

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro
65 70 75 80

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro
85 90 95

Pro Ala Ala Gly Gly Gly Ala 
100 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Val Gln Cys Arg Val Trp Leu Glu Ile Gln Trp Arg Gly Met Leu Gly
1 5 10 15

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His
20 25 30

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala
35 40 45

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly
50 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 
65 70 75 80 

Asp Glu Leu Lys Gly Val Thr Ser 
85 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
1 5 10 15

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
20 25 30

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
35 40 45

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
50 55 60

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
65 70 75 80

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln
35 40 45

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr
145 150 155 160

Leu Thr Leu Gln Gly Asp
165 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Arg Ala Glu Arg Met
1 5 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala
1 5 10 15

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr
20 25 30

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu
35 40 45

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn
50 55 60

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe
65 70 75 80

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe
85 90 95

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala
100 105 110

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met
115 120 125

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly
130 135 140

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro
145 150 155 160

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met
165 170 175

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met
180 185 190

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala
195 200 205

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly
210 215 220

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala
225 230 235 240

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly
245 250 255

Arg Arg Asn Gly Gly Pro Ala
260

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala
1 5 10 15

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly
20 25 30

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly
35 40 45

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr
50 55 60

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro
65 70 75 80

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val
85 90 95

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu
100 105 110

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr
115 120 125

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln
130 135 140

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr
145 150 155 160

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg
165 170 175

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly
180 185 190

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln
195 200 205

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser
210 215 220

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala
225 230 235 240

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser
245 250 255

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser
260 265 270

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn
275 280 285

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val
290 295 300

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Gly Cys Gly Glu Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn
1 5 10 15

Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 95: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Gly Cys Gly Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly
1 5 10 15

Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Gly Cys Gly Gly Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln
1 5 10 15

Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu
20 25 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 




Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
20 25 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Gly Cys Gly Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu
1 5 10 15

Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
20 25 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 



ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120

GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300

AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360

GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480

GAGTTGCTGC AGGCCGCAGG GAACTGA 507



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala
1 5 10 15

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro
20 25 30

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu
35 40 45

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
50 55 60

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly
65 70 75 80

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp
85 90 95

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe
100 105 110

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp
115 120 125

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val
130 135 140

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met
145 150 155 160

Glu Leu Leu Gln Ala Ala Gly Asn
165 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60 

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120 

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180 

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240 

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300 

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360 

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480

GCCGCCACCG CGGTGGAGCT 500

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102: 

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro 
1 5 10 15

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala
20 25 30

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser
35 40 45

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
50 55 60

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
65 70 75 80

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
85 90 95

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 
AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 
GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 
(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser
1 5 10 15

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly
20 25 30

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser
35 40 45 



Glu Ala Tyr 
50 



(2) INFORMATION FOR SEQ ID NO: 105: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT
TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 
GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 
GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 
ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 60

GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 120

TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 180

TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 240

GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 300

CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 360

GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 420

AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC 480

AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG 540

TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 600

CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA GGCCGAGCTG ACCGCCGCCC 660

AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG GCTGACGGTG CCCCCGCCGG 720

TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC GACCAACCTC TTGGGGCAAA 780

ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGCGA GATGTGGGCC CAAGACGCCG 840

CCGCGATGTT TGGCTACGCC GCGGCGACGG CGACGGCGAC GGCGACGTTG CTGCCGTTCG 900

AGGAGGCGCC GGAGATGACC AGCGCGGGTG GGCTCCTCGA GCAGGCCGCC GCGGTCGAGG 960

AGGCCTCCGA CACCGCCGCG GCGAACCAGT TGATGAACAA TGTGCCCCAG GCGCTGCAAC 1020

AGCTGGCCCA GCCCACGCAG GGCACCACGC CTTCTTCCAA GCTGGGTGGC CTGTGGAAGA 1080

CGGTCTCGCC GCATCGGTCG CCGATCAGCA ACATGGTGTC GATGGCCAAC AACCACATGT 1140

CGATGACCAA CTCGGGTGTG TCGATGACCA ACACCTTGAG CTCGATGTTG AAGGGCTTTG 1200

CTCCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGCGCA AAACGGGGTC CGGGCGATGA 1260

GCTCGCTGGG CAGCTCGCTG GGTTCTTCGG GTCTGGGCGG TGGGGTGGCC GCCAACTTGG 1320

GTCGGGCGGC CTCGGTCGGT TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 1380

CAGTCACCCC GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC GCGGAAAGAG 1440

GGCCCGGGCA GATGCTGGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 1500

GGCTCAGTGG TGTGCTGCGT GTTCCGCCGC GACCCTATGT GATGCCGCAT TCTCCGGCGG 1560

CCGGCTAGGA GAGGGGGCGC AGACTGTCGT TATTTGACCA GTGATCGGCG GTCTCGGTGT 1620

TTCCGCGGCC GGCTATGACA ACAGTCAATG TGCATGACAA GTTACAGGTA TTAGGTCCAG 1680

GTTCAACAAG GAGACAGGCA ACATGGCCTC ACGTTTTATG ACGGATCCGC ACGCGATGCG 1740

GGACATGGCG GGCCGTTTTG AGGTGCACGC CCAGACGGTG GAGGACGAGG CTCGCCGGAT 1800

GTGGGCGTCC GCGCAAAACA TTTCCGGTGC GGGCTGGAGT GGCATGGCCG AGGCGACCTC 1860

GCTAGACACC ATGGCCCAGA TGAATCAGGC GTTTCGCAAC ATCGTGAACA TGCTGCACGG 1920

GGTGCGTGAC GGGCTGGTTC GCGACGCCAA CAACTACGAG CAGCAAGAGC AGGCCTCCCA 1980

GCAGATCCTC AGCAGCTAAC GTCAGCCGCT GCAGCACAAT ACTTTTACAA GCGAAGGAGA 2040

ACAGGTTCGA TGACCATCAA CTATCAATTC GGGGATGTCG ACGCTCACGG CGCCATGATC 2100

CGCGCTCAGG CCGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 2160

GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 2220

GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 2280

GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 2340

GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT 2400

CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 2460

CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 2520

TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 2580

GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACCTCTCGG TTGAGGCGTT CCTGGGGGTT 2640

GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 2700

GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 2760

CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 2820

GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 2880

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 2940

GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3000

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 391 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala
145 150 155 160

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr
165 170 175

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn
225 230 235 240

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala
275 280 285

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly
290 295 300

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val
305 310 315 320

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg
325 330 335

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly
340 345 350

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly
355 360 365

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met
370 375 380

Pro His Ser Pro Ala Ala Gly
385 390

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 



GACGTCAGCA 


CCCGCCGTGC 


AGGGCTGGAG 


CGI GG1CGG1 


rprprp^Tv T>r~ , T , PP 
111 KlPi. 1 <w 1 


bb X v^ririO o i o 


60 


ACGTCCCTCG 


GCGTGTCGCC 


GGCGTGGATG 


CAGACTCGAT 


rT % r , /T^ r Pf^"T"T ,r P 

(jjCCOL, 1 U 1 I 1 


A PTPP A A PT A 


120 


ATTTCGTTGA 


AGTGCCTGCG 


AGGTATAGGA 


CTTCACGA1 J. 


/— /"Tprp tv 7\ m/^ rp tv 


pppttp appp 


IRQ 

J- O <J 


CGTGTTGGGG 


TCGATTTGGC 


CGGACCAGTC 


GTCAtLAALb 


bl IbbLblbb 


bbbb ^-riu Kj v^/ O 


24 0 


GGCGATCAGA 


TCGCTTGACT 


ACCAATCAAT 


/-t rp rp f> TV rp 

CrTGAGCiCC 




PTPPPP.PT A A 


300 


ATGAGGAGGA 


GCACGCGTGT 


CTTTCACTGC 


GCAACCGGAG 


AlCjl lbbbbb 


Lbu'obbv^ x oo 


360 


CGAACTTCGT 


TCCCTGGGGG 


CAACGCTGAA 


GGC 1 AGCAA I 


chcchc p c a p 

bbbbbbbbnb 


PPPTPPPPAP 


420 


GACTGGGGTG 


GTGCCCCCGG 


CTGCCGACGA 


GG 1 G 1 CGC 1 <j 




P AP A ATTPPP 


4 80 


TACGCATGCG 


GCGACGTATC 


AGACGGCCAG 


r""' /"*« a - * 7\ 7\ C* 

CGCCAAGGCC 


bbbb 1 o.tt.1 v_ 


ATP APPAPTT 


54 0 


TGTGACCACG 


CTGGCCACCA 


GCGCTAGTTC 


AlAlbLbbaL 


/\C ^ bnb v-* ^ o 


PPA APPPTPT 


600 


GGTCACCGGC 


TAGCTGACCT 


GACGGTATTC 


GAGCGGAACjLj 


TvrpipTvrpp/^T\7\rj 


TPPTPPATTT 

X <D X \D \Dt\ XXX 


660 


CGGGGCGTTA 


CCACCGGAGA 


TCAACTCCGC 


GAGGA1 GI AC 


CCCCfTT'CCCl 
bbbbbbbbbb 


PTTPPPPPTP 


720 


GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 780

GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 840

TCTGATGGCG GCGGCGGCCT CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC 900

CCAGCTGACC GCCGCCCAGG TCCGGGTTGC TGCGGCGGCC TACGAGACAG CGTATAGGCT 960

GACGGTGCCC CCGCCGGTGA TCGCCGAGAA CCGTACCGAA CTGATGACGC TGACCGCGAC 1020

CAACCTCTTG GGGCAAAACA CGCCGGCGAT CGAGGCCAAT CAGGCCGCAT ACAGCCAGAT 1080

GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGCGA CGGCGACCGA 1140

GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC CCCGGCGGGC TCCTTGAGCA 1200

GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT 1260

GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC GTCGTACCTT CTTCCAAGCT 1320

GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGCCG CTCAGCAACG TCAGTTCGAT 1380

AGCCAACAAC CACATGTCGA TGATGGGCAC GGGTGTGTCG ATGACCAACA CCTTGCACTC 1440

GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 1500

GGTCTGGGCG ATGAGCTCGC TGGGCAGCCA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 1560 

GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 1620 

GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 1680 

CAGCCTGACC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 1725
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met
130 135 140

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val
245 250 255

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly
355

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3027 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 60

CGTCGGACGC GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 120 

CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 180 

GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 240

GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 300

ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 360

CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 420

GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 480

GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 540

TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 600

CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 660

CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 720 

TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 780

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 840

GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900 

GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 960 

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 1020 

ATTCTCATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 1080 

TACGACCCGA TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 1140

ACGGGACAAT CGCGCTGTTG TGTATTCACC CGACACGTCG ACCAAGACTT CGCTACCGCC 1200

AGGTCGCACC TCGCCGGCGA TTTCCTGTCC TATACGACCA GTTCACGCAG CAGATCGTGG 1260

CTCCGGCGGC CAAACAGAAG TCACTGAAAA CCACCGCCAA GGTGGTGCGC GCGGCCGTGT 1320 

CGGAGCTACA TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 1380 

AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 1440

ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGCCGTAGGC GGTCGCCAAG 1500 

TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 1560

GCCCGACCTC AAACAGATCT CGGCCGCTGT CTAATCGGCC GGGTTATTTA AGATTAGTTG 1620 

CCACTGTATT TACCTGATGT TCAGATTGTT CAGCTGGATT TAGCTTCGCG GCAGGGCGGC 1680 

TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 1740

GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 1800

CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 1860

CTCGCTGGTG GCCGCCGCGA AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC 1920 

GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 1980

GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 2040

GGCCGAGCTG ACCGCCGCCC AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 2100 

GCTGACGGTG CCCCCGCCGG TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 2160 

GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 2220 

GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GCCACGGCGG CGACGGCGAC 2280 

CGAGGCGTTG CTGCCGTTCG AGGACGCCCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 2340

GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GCGAACCAGT TGATGAACAA 2400

TGTGCCCCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 2460

ACTGAGTGAA CTCTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGCA ACATCGTGTC 2520

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGCCA GCACCTTGCA 2580

CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGCAAAA 2640

CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 2700

TCTGGGCGCT GGGGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2760

GCCGCAGGCC TGGGCCGCGG CCAACCAGGC GGTCACCCCG GCGGCGCGGG CGCTGCCGCT 2820

GACCAGCCTG ACCAGCGCCG CCCAAACCGC CCCCGGACAC ATGCTGGGCG GGCTACCGCT 2880

GGGGCAACTG ACCAATAGCG GCGGCGGGTT CGGCGGGGTT AGCAATGCGT TGCGGATGCC 2940

GCCGCGGGCG TACGTAATGC CCCGTGTGCC CGCCGCCGGG TAACGCCGAT CCGCACGCAA 3000

TGCGGGCCCT CTATGCGGGC AGCGATC 3027


(2) INFORMATION FOR SEQ ID NO: 111: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu
210 215 220

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn
355 360 365

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro
370 375 380

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly
385 390 395



(2) INFORMATION FOR SEQ ID NO: 112: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 60 

GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 120 

TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 180 

AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 240 

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300 

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360 

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420 

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480 

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540 

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600 

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660 

GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720 

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780 

GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840 
CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900 
GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 960 

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020 

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080 

TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140 

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 1200 

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 1260 

GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 1320 

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380 

GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440 

AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500 

TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 1560 

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 1616 
(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60 

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120 

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180 

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240 

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300 

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360 

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420 

TACGCCTCCG AA 432 
(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met 
1 5 10 15 

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln 
20 25 30 

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg 
35 40 45 

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala 
50 55 60 

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr 
65 70 75 80 

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr 
85 90 95 

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn 
100 105 110 

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn 
115 120 125 






Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp 
130 135 140 

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val 
145 150 155 160 

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro 
165 170 175 

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro 
180 185 190 

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr 
195 200 205 

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln 
210 215 220 

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly 
225 230 235 240 

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly 
245 250 255 

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val 
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser 
325 330 335 

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln 
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 



Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly 
1 5 10 15 

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val 
20 25 30 

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly 
35 40 45 

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys 
50 55 60 

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly 
65 70 75 80 

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser 
85 90 95 

Gln Met Gly Phe 
100 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120 

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180 

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240 

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300 

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360 

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396 
(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala 
1 5 10 15 

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln 
20 25 30 

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu 
35 40 45 

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser 
50 55 60 

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
65 70 75 80 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60 

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120 

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180 

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240 

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300 

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360 

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240 

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val 
1 5 10 15 

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 
1 5 10 15 

Ser 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 
1 5 10 15 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro 
1 5 10 15 

Gly Gly Arg Arg Xaa Phe 
20 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either a Pro or Thr" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either a Gln or Leu" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile 
1 5 10 15 

Asn Val His Leu Val 
20 

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 60 

TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 120 

GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 180 

ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 240 

GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300 

CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360 

TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC ACCGCCGACG ACGCCGGTGA 420 

CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 480 

CGACGCCGCC GACCACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 540 

CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC GACGACGGTC GCTCCGACCA 600 

CCGTCGCCCC GACCACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660 

CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720 

CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780 

GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGGT TCACTACGGT CGGAGGACAT 840 

GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882 

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 60 

CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 120 

CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 240 

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 300 

TCGTGCCCGT ACCGCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 360 

ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CGCCATTGGC ACCGCCGTCA CCGCCGGCTG 420 

GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGCGC AAGTACTCTC GGTCACCGAG 480 

CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 540 

AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 600 

CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 660 

CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 720 

ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 780 

GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 815 

(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 



ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 60 

CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA CAGGCGCTGA TTTGACGCTC 120 

TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 180 

GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 240 

GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 300 

GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 360 

AAGCGTGAGC GCGACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 420 

ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 480 

GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 540 

TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 600 

TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 660 

GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT CGCATTCCAG 720 

GGCCGCTCGA CTCGGCCGCC GTTGGACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 780 

CTGCTGTACA AGAACATCAT AGGGGCGATC GAGCGTCACA GCCTGAACGC GTATATCGGT 840 

TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG CCGAATTCGG CACGAGCTCC 900 

GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 960 

CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 1020 

ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 1080 

GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 1140 

GTTCGGCACG AG 1152 



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

CTCGTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60 

CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120 

CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180 

CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240 

ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGCGCG GGGCAGCTTC 300 

GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360 

TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420 

TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480 

TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540 

TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600 

TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCCCGGACC GCCCA 655 
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val 
1 5 10 15 

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu 
20 25 30 

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr 
35 40 45 

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala 
50 55 60 

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg 
65 70 75 80 

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro 
85 90 95 

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro 
100 105 110 

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro 
115 120 125 

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr 
130 135 140 

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr 
145 150 155 160 

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr 
165 170 175 

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala 
180 185 190 

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro 
195 200 205 

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro 
210 215 220 

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala 
225 230 235 240 

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn 
245 250 255 

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe 
260 265 

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro 
1 5 10 15 

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro 
20 25 30 

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu 
35 40 45 

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro 
50 55 60 

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr 
65 70 75 80 

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro 
85 90 95 

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser 
100 105 110 

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro 
115 120 125 

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile 
130 135 140 

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala 
145 150 155 160 

Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly 
165 170 

(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly 
1 5 10 15 

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg 
20 25 30 

Asn Arg Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu 
1 5 10 15 

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr 
20 25 30 

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu 
35 40 45 

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala 
50 55 60 

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp 
65 70 75 80 

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala 
85 90 95 

Gly Gln Leu Arg Arg Gln Phe Tyr 
100 

(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53 
(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42 
(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31 
(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31 
(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33 
(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 
GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33 
(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 152..1273 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172 
Val Lys Ile Arg Leu His Thr 
1 5 

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220 
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly 
10 15 20 

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268 
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala 
25 30 35 

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316 
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu 
40 45 50 55 

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364 
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala 
60 65 70 

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412 
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly 
75 80 85 

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460 
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly 
90 95 100 

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508 
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly 
105 110 115 

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556 
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn 
120 125 130 135 

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604 
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala 
140 145 150 

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652 
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala 
155 160 165 

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700 
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu 
170 175 180 

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748 
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu 
185 190 195 

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796 
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr 
200 205 210 215 

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844 
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn 
220 225 230 

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892 
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr 
235 240 245 

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940 
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu 
250 255 260 

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988 
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln 
265 270 275 

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036 
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn 
280 285 290 295 

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084 
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile 
300 305 310 

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132 
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala 
315 320 325 

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180 
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly 
330 335 340 

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228 
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro 
345 350 355 

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273 
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser 
360 365 370 

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333 

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393 

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453 

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513 

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GCCCATCCGG TCGGCGCCTA 1573 

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633 

GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693 

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753 

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813 

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873 

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933 

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993 



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro 
1 5 10 15 

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser 
20 25 30 

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser 
35 40 45 

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu 
50 55 60 

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr 
65 70 75 80 

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala 
85 90 95 

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly 
100 105 110 

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser 
115 120 125 

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys 
130 135 140 

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr 
145 150 155 160 

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro 
165 170 175 

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr 
180 185 190 

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly 
195 200 205 

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly 
210 215 220 

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu 
225 230 235 240 

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala 
245 250 255 

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn 
260 265 270 

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe 
275 280 285 

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro 
290 295 300 

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn 
305 310 315 320 

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu 
325 330 335 

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val 
340 345 350 

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu 
355 360 365 

Ile Ala Thr Ile Ser Ser 
370 

(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 
AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 
GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180 
CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240 
CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300 
GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360 
GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420 
CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480 
GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540 
GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600 
GGCGGCCATG TACCAGGGCA CCATCAAAAC CTGGGACGAC CCGCAGATCG CTGCGCTCAA 660 
CCCCGGCGTG AACCTGCCCG GCACCGCGGT AGTTCCGCTG CACCGCTCCG ACGGGTCCGG 720 
TGACACCTTC TTGTTCACCC AGTACCTGTC CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC 780 
GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGCCG GGTGCGCTGG GTGAGAACGG 840 
CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC TGCGTGGCCT ATATCGGCAT 900 
CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG GCCCAACTAG GCAATAGCTC 960 
TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC GCGGCGGCTG GCTTCGCATC 1020 
GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCCGG ACGGCTACCC 1080 
GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG CCACCGCGCA 1140 
GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT CGTTCCTCGA 1200 
CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 1260 
GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC GGGCCATCGG 1320 
GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380 
CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 1440 
GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500 
ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTCGCCCATC 1560 
CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG GACGCTGGCG ACCTCGGCAA 1620 
TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC GCTGGTGATC GTGGAACGGC 1680 
TGCCGAAACG GTTGGCCGAG GCTGTGGGAA TAGTCCTGGA ATTGCTCGCC GGAATCCCCA 1740 
GCGTGGTCGT CGGTTTGTGG GGGGCAATGA CGTTCGGGCC GTTCATCGCT CATCACATCG 1800 
CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860 
CGGGCAACGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920 
TTATCGCCAC CACCACTCAT GACCTGTTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980 
CGATCGGGAA TTC 1993 
(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro 
1 5 10 15 

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser 
20 25 30 

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser 
35 40 45 

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu 
50 55 60 

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr 
65 70 75 80 

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala 
85 90 95 

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly 
100 105 110 

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser 
115 120 125 

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys 
130 135 140 

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr 
145 150 155 160 

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro 
165 170 175 

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr 
180 185 190 

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly 
195 200 205 

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly 
210 215 220 

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu 
225 230 235 240 

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala 
245 250 255 

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn 
260 265 270 

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe 
275 280 285 

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro 
290 295 300 

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn 
305 310 315 320 

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu 
325 330 335 

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val 
340 345 350 

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu 
355 360 365 

Ile Ala Thr Ile Ser Ser 
370 

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60 

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120 

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180 

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240 

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300 

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360 

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420 

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480 

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540 

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600 

TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660 

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720 

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780 

CATCGCGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840 

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900 

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960 

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020 

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080 

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140 

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200 

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260 

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320 

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380 

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440 

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500 

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560 

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620 

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680 

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740 

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777 



(2) INFORMATION FOR SEQ ID NO: 157: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60 

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120 

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180 

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240 

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300 

CGTGACCGAC GCCGCCGATT CAGA 324 
(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60 

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120 

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180 

GCACGACTTC CAGCCCGACT CGATCGGCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240 

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCGACGCCG ACGAGCAGCG 300 

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360 

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420 

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480 

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540 

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACC CTGCGGTGCT 600 

GGCGTCATTC GGACATCGGT CCGGCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660 

GGAGCGCGGC GGTGCACGCG CTGCTGGACG GCCGGCAGAC GGTGCTGCTG CGTAAGGGCG 720 

GGATCGGCGA GAAGCGCTTC GAGGTGGCGG CCCACGAGTT CTTGTTGTTC CCGACGGTCG 780 

CGCACAGCCA CGCCGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840 

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900 

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960 

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020 

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080 

CCAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140 

CGCTGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCGTGGG TTGACTGGGC GGCATCGCTT 1200 

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260 

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320 

GCGCCCACCG CTACAACC 1338 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 



CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60 

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120 

TAGGTGGGGC CGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180 

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240 

AGGGCGGCGA CGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCC GGCACCGGCA 300 

ACGGCGGCCA TGCCGGCAAC C 321 


(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 



GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60 

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120 

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180 

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240 

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300 

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360 

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420 

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480 

ACCGTCGCCG GT 492 



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala 
1 5 10 15 

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg 
20 25 30 

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr 
35 40 45 

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro 
50 55 60 

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu 
65 70 75 80 

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys 
85 90 95 

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys 
100 105 110 

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu 
115 120 125 

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala 
130 135 140 

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly 
145 150 155 160 

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu 
165 170 175 

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp 
180 185 190 

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg 
195 200 205 

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp 
210 215 220 

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser 
225 230 235 240 

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg 
245 250 255 

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn 
260 265 270 




His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr 
275 280 285 

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val 
290 295 300 

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met 
305 310 315 320 

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg 
325 330 335 

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val 
340 345 350 

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp 
355 360 365 

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg 
370 375 380 

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu 
385 390 395 400 

Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln 
405 410 415 

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly 
420 425 430 

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp 
435 440 445 

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu 
450 455 460 

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu 
465 470 475 480 

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met 
485 490 495 

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala 
500 505 510 

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu 
515 520 525 

His Asp Ser Pro Ala Gly Arg Arg 
530 535 

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg 
1 5 10 15 

Tyr Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys 
20 25 30 

Asp Phe Gly Leu Val Gly Pro Ala Pro Asp Ile Val Thr Asp Ala Ala 
35 40 45 

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val 
50 55 60 

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu 
65 70 75 80 

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly 
85 90 95 

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His 
100 105 110 

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe 
115 120 125 

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser 
130 135 140 

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg 
145 150 155 160 

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val 
165 170 175 

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val 
180 185 190 

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg 
195 200 205 

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg 
210 215 220 

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His 
225 230 235 240 

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val 
245 250 255 

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala 
260 265 270 

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe 
275 280 

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 264 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

ATGAACATGT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60 

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120 

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180 

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240 

GCAGCGGTGC TTGACGGTGT GGCG 264 
(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 



TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60 

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120 

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180 

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240 

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300 

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360 

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420 

ATCCCCAGCC GGCGCAGCGT TGCCCCTCGT GCCGAATTCG GCACGAGGCT GGCGAGCCAC 480 

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540 

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600 

GCCAGCGGGT ATCGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660 

ACTATTGGCA CTTTCCGGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720 

CCCCACGTAA CCCACGGCGT AGCTCCCGGC GTGACGCGGA GGATCGGCGG GTGATCTTTG 780 

CCGCCACGCT CGTAGCCGTT GATCCACCGC TTCGCGGTGC CGGCGGGGAG GCCGATCAGC 840 

TTATCGACCT CGGCGTATGC CGACGGCAAb            TPHTrGAGGT CAAGAACTCC 900 

ACCATCGGCA CCGGCACCAA GGTGCCGCAC CTGACCTACG TCGGCGACGC CGACATCGGC 960 

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020 

CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080 

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140 

GGGGCGCTGG CAGTGTCGGC GGGTCCGCAA C 1171 



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 
GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60 
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120 
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180 
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227 
(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 
CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60 
GGGCGCCCAC GGCTTCACTC CAACCAGCGG CGGCGACGGC GGCGACGGCG GCAACGGCGG 120 
CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180 
CGCCGGCACG GGCGGCAACG GCGGCCGCGG CGGCGACGGC GCGTTTGGTG GCATGAGTGC 240 
CAACGCCACC AACCCTGGTG AAAACGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300 
CGGC 304 

(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60 

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGGG CGAGCACCGC GCCCGGGATP 120 

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA            180 

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG            240 

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAPA 300 

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAC^AAAPPA 360 

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GAP A AC A ACZC 420 

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC            480 

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAPAPP 540 

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCAPA 600 

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660 

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720 

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780 

KjKd 1 111 Iv^kaL*            (jULataUAbG 1 A AGCAAGGGTC TATCAACCGG TGCGGGCAGG 840 

TGCTGTTCAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900 

CCAACGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960 

CCGGCTCCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT CAACGGCGCG GGCGGTGCTG 1020 

GCGGGGCCGG CGGCAACGCG GGTGTCGCCG GCGTGTCCTT CGGCAACGCT GTGGGCGGCG 1080 

ACGGCGGCAA CGGCGGCAAC GGCGGCCACG GCGGCGACGG CACGACGGGC GGCGCCGGCG 1140 

GCAAGGGCGG CAACGGCAGC AGCGGTGCCG CCAGCGGCTC AGGCGTCGTC AACGTCACCG 1200 

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260 

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320 

GCGGCGCCAG CGGCAAGGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380 

TCAACGTCAC CGCCGGCCAC GGCGGCAACG GCGGCAATGG CCGCAACGGC GGCAACGGC 1439 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60 

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120 

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180 

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240 

GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300 

CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329 
(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 
GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGGGGAT GGCCGGAGGT AACTGTGGTG 60 
CCGCCGGGCT GATCGGCAAC 80 
(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA 60 

AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 120 

TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 180 

CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 240 

CACAACTGAT GGCGCGCAAC GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 300 

CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 360 

CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 392 
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 



ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60 

GGTATCAGCG GTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120 

GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180 

GGCGGCACCG GCGGCACCGG CGGAGCGGCC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240 

ACCGGCGGCA CCGGCGGCGC GGTGGGCAGC GTCGGTAACG CCGGGATCGG CGGTACCGGC 300 

GGTACGGGTG GTGTCGGTGG TGCTGGTGGT GCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360 

GCTACCGGTG GCGCCGGGTT CGCCGGCGGC GCCGGCGGAG AAGGCGGACC GGGCGGCAAC 420 

AGCGGTGTGG GCGGCACCAA CGGCTCCGGC GGCGCCGGCG GTGCAGGCGG CAAGGGCGGC 480 

ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG 535 



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60 

GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120 

CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180 

GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240 

CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA CGGCCGACCT 300 

GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGCTGCCGC AACCCGCCGA 360 

TCTGGTGGCC GCGGTGGCCG CCGCACCGAC GCCGGCCGAG GTGGTGAACA CGCTCGCCAG 420 

GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480 

ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540 

ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGCGGG 600 

CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660 

CCTCGTCACC TAACGGATTC CCGACGGCAT 690 



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60 

TGGGCACAGC CGGGGGTGGC GGCAACGGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120 

GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180 

CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240 

GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300 

TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGCTACGG CGGCGATGGC GGTGGCGGCG 360 

GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407 
(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 
GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60 
TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120 
ACAGCCACal C 


LI 1 <jLjHj(jL.L. 


1 CjCbACjCaCCaA 


ACACGTCGGT GTCACCGGTG TAGATCGCCG 180 
GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240 
TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300 
GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360 
GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420 
CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468 



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60 

GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120 

GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180 

GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 219 
(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 



TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCCAC GGCGGTGACG GCGTCGGCGG 60 

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120 

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT CGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180 

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240 

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300 

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCGGGCG GTGGCGGAGG 360 

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420 

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480 

ATTTCCTGAT CACC 494 

(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 

GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220 
(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 388 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:



ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60
GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120
TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180
CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240
AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300
GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360
GCGCGTACTG CCCCGAACAC CTGGAACA 388



(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60 

ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120

TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180

GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCCGGC ACCACAGGCG 240

GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 300

GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 360

GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 400
(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 538 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 
GGCAACGGCG GCAACGGCGG CATCGCCGGC ATTGGGCGGC AACGGCGTTC CGGGACGGGC 60 

AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG 120 

CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 180 

GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 240

CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 300

GCAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360

CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 420

CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 480

CGGGGGCACC GGCGGCAACG GCGGCACCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC 538 
(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60
TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180
GCCACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC 239
(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 985 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 



AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60
GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120
GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180
GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGGT 240
ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 300
GGCCGCGGCG GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT 360
GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420
GGGGCCGGCG GGGCCGGCGG CAACGGCGGC GACGGCGGGG ACGGCGCAAC CGGTGCCGCA 480
GGTCTCGGCG ACAACGGCGG GGTCGGCGGT GACGGTGGGG CCGGTGGCGC CGCCGGCAAC 540
GGCGGCAACG CGGGCGTCGG CCTGACAGCC AAGGCCGGCG ACGGCGGCGC CGCGGGCAAT 600
GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660
CAGGGTGGTG CCGGCGGCCA AGGCGGCCAA GGCGGCTTGG GCGGGGCAAG CACCACCTGA 720
TCGGCCTAGC CGCACCCGGG AAAGCCGATC CAACAGGCGA CGATGCCGCC TTCCTTGCCG 780
CGTTGGACCA GGCCGGCATC ACCTACGCTG ACCCAGGCCA CGCCATAACG GCCGCCAAGG 840
CGATGTGTGG GCTGTGTGCT AACGGCGTAA CAGGTCTACA GCTGGTCGCG GACCTGCGGG 900
AATACAATCC CGGGCTGACC ATGGACAGCG CGGCCAAGTT CGCTGCCATC GCATCAGGCG 960
CGTACTGCCC CGAACACCTG GAACA 985



(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2138 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 



CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540
CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600
TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660
CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT 720
TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 780
AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 840
CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 900
AACGGCTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 960
ATCAGCAGAG GTCGGAGAAG GTGCTGACCG AATACAACAA CAAGGCAGCC CTGGAACCGG 1020
TAAACCCGCC GAAGCCTCCC CCCGCCATCA AGATCGACCC GCCCCCGCCT CCGCAAGAGC 1080
AGGGATTGAT CCCTGGCTTC CTGATGCCGC CGTCTGACGG CTCCGGTGTG ACTCCCGGTA 1140
CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 1200
CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 1260
TGGCGGTCAA AGCGGCATCG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 1320
TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 1380
GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC 1440
CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 1500
ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 1560
GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 1620
GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 1680
CGGATCCTTC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 1740
GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 1800
GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 1860
CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 1920
AATGGCCTAA GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 1980
GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 2040
GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100
GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 2138


(2) INFORMATION FOR SEQ ID NO: 184: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asp Gln Gln Arg
180 185 190

Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met
305 310 315 320

Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala
405 410 415

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460

(2) INFORMATION FOR SEQ ID NO: 185:






WO 98/16646 PCT/US97/18293 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala
50 55 60

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro
65 70 75 80

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gln Gly Cys Leu Val
115 120 125

Val Val Phe Gly Gln His Leu Leu Arg Pro Leu Leu Ile Leu Arg Val
130 135 140

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro
145 150 155 160

Phe Glu Pro Asp Tyr Val Phe Ile Ser Arg Met Phe Pro Pro Ser Pro
165 170 175

His Val Gln Leu Arg Asp Ile Leu Ser Leu Leu Gly His Arg Ser Ala
180 185 190

Gln Phe Gly His Val Glu Tyr Pro Leu Pro Leu Leu Ile Glu Arg Ser
195 200 205

Leu Ala Ser Gly Ser Arg Ile Ala Phe Pro Val Val Lys Pro Pro Glu
210 215 220

Pro Leu Asp Val Ala Leu Gln Arg Gln Val Glu Ser Val Pro Pro Ile
225 230 235 240

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro
245 250 255

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro
1 5 10 15

Leu Asp Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Trp Thr Ser Met Pro Ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr
50 55 60

Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val Ile Pro Ser Arg Arg
65 70 75 80

Ser Val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser His Arg
85 90 95

Ala Ser Pro Ser Asn Ala Cys Pro Val Arg Ile Val Thr Ser Ala Ser
100 105 110

Gly Arg Pro Ile Ser Ser Pro Pro Ile Val Arg Ser Arg Ser Cys Val
115 120 125

Asp Lys Asn Gly Arg Arg Cys Ala Ser Gly Tyr Arg Arg Leu Asn Arg
130 135 140

Ala Arg Ser Ser Ser Ile Ala Ala Arg Cys Arg Thr Ile Gly Thr Phe
145 150 155 160

Arg Arg Ser Arg Tyr Ser Ala Ser Met Arg Val Ser Thr Asn Ser Pro
165 170 175

His Val Thr His Gly Val Ala Pro Gly Val Thr Arg Arg Ile Gly Gly
180 185 190



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 196 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

Ala Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro
35 40 45

Arg Asn Pro Arg Arg Ser Ser Arg Arg Asp Ala Glu Asp Arg Arg Val
50 55 60

Ile Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala
65 70 75 80

Gly Gly Glu Ala Asp Gln Leu Ile Asp Leu Gly Val Cys Arg Arg Gln
85 90 95

Ala Gly Arg Val Arg Arg Gly Gln Glu Leu His His Arg His Arg His
100 105 110

Gln Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val
115 120 125

Gln Gln His Arg Arg Leu Gln Arg Val Arg Gln Leu Arg Arg Tyr Val
130 135 140

Gln Thr Ala His His Arg Arg Phe Ala Arg Thr Asp Arg Val Arg His
145 150 155 160

His Val Arg Gly Pro Ser Asn His Arg Arg Arg Arg Val Tyr Arg Gly 
165 170 175 

Arg His Ser Gly Ala Gly Gly Cys Pro Ala Gly Gly Ala Gly Ser Val 
180 185 190 

Gly Gly Ser Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr
1 5 10 15

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys
20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly
50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro Ile Val Arg Gln Arg Ala
85 90 95

Ala Ile Glu Glu Leu Arg Ala Arg Phe Asn Leu Arg Tyr Pro Leu Ala
100 105 110

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Gly Glu Glu Ile Gly Leu Pro Asp Ala Glu Val Thr Ile Arg Thr Gly
130 135 140

Gln Ala Leu Leu Gly Asp Ala Arg Trp Leu Ala Ser Leu Val Pro Asn
145 150 155 160

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly Ile Thr Asp Val Ala
165 170 175

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val
180 185 190

Pro Asp Gly Ile Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp
195 200 205

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu
210 215 220

Leu Thr Asn Asp Gly Ser Asn Gly Glu Ser Gly Glu Ser Ser Gln Ser
225 230 235 240

Ile Asn Asp Ala Ala Thr Arg Tyr Met Thr Asp Glu Tyr Arg Gln Phe
245 250 255

Pro Thr Arg Asn Gly Ala Gln Arg Ala Leu His Arg Val Val Thr Leu
260 265 270

Leu Ala Ala Gly Arg Pro Val Leu Thr His Cys Phe Ala Gly Lys Asp
275 280 285

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 60
CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 120
CGCAGGCGTT GAACGCCTTG AATGAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 180
TGGCGCTATT CGTCAACGCG CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 240
ACCTTGCGGA GTTCACCGAC AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 300
AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 360
TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 420
TGGATGGGTT GGAGACCGTC CTGCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 480
TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 540
CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 600
CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 660
CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 720
AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 780
CGGATACGCC GTTGTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 840
AAGGGGTTCA GGTGGGACCG ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 900
TCATGGGTGG TCCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 960
CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 1020
CGATACCACC GCCGCCTCCT GGGCCCGACG TAATCCCGGG TCCGGTGCCA CCGGTCTTGG 1080
CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAACTTCGAC TACATGGGCC 1140
TCTTGTTGCT GTCGCCGGGC CTGGCGACCT TCCTGTTCGG GGTGTCATCT AGCCCCGCCC 1200
GTGGAACGAT GGCCGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATCG 1260
CGGCATTCGT CGCACATTCG TGGTACCGCA CAGAACATCC GCTCATAGAC ATGCGCTTGT 1320
TCCAGAACCG AGCGGTCGCG CAGGCCAACA TGACGATGAC GGTGCTCTCC CTCGGGCTGT 1380
TTGGCTCCTT CTTGCTGCTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC 1440
AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 1500
GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 1560
CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 1620
TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATGGGCTG CTCCATGATG CCACTGTCCG 1680
GGGCGGCAGT GCAGACCCTG GCCCCACATC AGATCGCTCG CGGTTCGACG CTGATCAGCG 1740
P,P,TP.P.GCGGT TCGATAGGGA CCGCACTGAT GTCGGTGCTG CTCACCTACC 1800
AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860
GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTACC GCGCCAAACC AACTTCGCGG 1920
CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980
TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040
GAGCACCGTT GCTATCCGCA TGACGTCTGC TT 2072



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1923 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 



TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60
TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120
TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180
CGGCTCAGGC GTTGCGCGCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240
GGTGCGTTTC ACCCACATCG CAAGCTCGAG ACGCCCGTCG TCCTCTTGCA CGCTCAGCCA 300
GGTTGGCGTG TCGCCGCCTT CCAGCAAGTG TTCCCACCAC ACGAAGGGAC CCTCGCGAAA 360
GGTGACTGAT CCGCGGACCA CATAGTCGAT GCCACCGTGG CTGACAATTG CGCCGGGTCC 420
GAGTTGGCGG GGGCCGAATT GCGGCATTGC GTCGAAGGCC AGCGGATCCC GGCGCCCGCC 480
CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540
GAACAGCGCC ACGGCAATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600
CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660
GGACCGGGGG TCACAACGGG CGAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720
TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780
AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840
GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC 900
CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 960
GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC 1020
GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 1080
TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 1140
GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 1200
CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 1260
TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 1320
TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 1380
AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCCTG CGACATCGAT GGTTGTTCGC 1440
ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG 1500
ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 1560
ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 1620
CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGTCGGATGC GGCCTTGGCG 1680
CAGCGCCTGT TGGCCATCAG CGATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 1740
GTGGTGGCTA TCGAACGGGT GTTCTCTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 1800
CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 1860
ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC 1920
ACC 1923
(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1055 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 60 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 120 

GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 180 

AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 240

GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG 300

GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 360

AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 420

GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 480

CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 540 

CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 600 

CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 660 

GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 720 

AGCAGCGTGA TTCCGCATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 780 

TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 840 

CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA GACAGGCAAC ATGGCAACAC 900 

GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAG GTGCACGCCC 960 

AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TCGGGNGCGG 1020 
GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGAC 1055 
(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 



CCGCCTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 60
TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 120
CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 180
CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 240
GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 300
TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 359



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:
AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 60
GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 120
CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 180
TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 240
TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 300
GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 350
(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 679 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp
1 5 10 15

Gly Leu Ala Gly Lys Gly Lys Gln Ile Asn Thr Thr Leu Asn Ser Leu
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp
50 55 60

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val
100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro
130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe
165 170 175

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile Lys
195 200 205

Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala Ser Thr
210 215 220

Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro Pro Asn
225 230 235 240

Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 255

Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro Gly Met
260 265 270

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr Pro Glu
275 280 285

Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
290 295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
305 310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335

Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val Leu
340 345 350

Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365

Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His
385 390 395 400

Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala Phe Val
405 410 415

Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460

Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510

Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala Asp
515 520 525

Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met
530 535 540

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gln Thr Leu Ala
545 550 555 560

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln
565 570 575

Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr Tyr
580 585 590

Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650 655

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala
675

(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser
1 5 10 15

Met Val Glu Ile Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
20 25 30

Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala
35 40 45

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu
50 55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg
65 70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala
85 90 95

Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr
100 105 110

Thr Arg Arg Asp Pro Arg Glu Arg
115 120

(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg
1 5 10 15

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser
20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80

Gly Asp Gly Ser Asp Val Thr Val Gly
85

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala 
1 5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp
20 25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly
35 40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln
50 55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly
65 70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro
85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys
100 105 110

Pro Asp Ala Gly Ile Gly Gln
115 

(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu 
1 5 10 15 

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu
50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu
65 70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val
85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 105 110

Glu Asp Phe Ser 
115 

(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 




(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 



TGCTACGCAG CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGCTGC TGGCGATGGC 60

GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCGAACTGCT CAAAAGGCTG 120

GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 180

GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 240

TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 300

ACGCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 360

TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 420

ACCAGGCCGC CATGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 480

ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 540

TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 600

GACCGGTCGA CCCGCTGCCG CGGCGGCGAG CCGACCTGGA CAATGGGGTC TTCGAGCACC 660

GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGTTGC 720

GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 780

GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G 811



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:200: 

GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 60 

GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 120 

GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 180 

ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 240

GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 300

TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 360

CCAAGGACAA CGGACGCATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 420

ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 480

ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 540

TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 600

CGAAGTCGTC GAAATCGTAT GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 660

CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 720

TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 780

GCGGCGCTAT AGAAGCCGCT CTGCGCGATT ATCAAACGCA AAATACGCTT ACTCATGCCA 840

TCGGCGCTGC TCACCCGATG CGACGTTTTT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 900

AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 960

CATCCT 966



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 



CCGCACCGCC GGCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT 60

TGCCGCCCGT CCCGCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 120

CACCGGAGCC GCCGTTGCCG CCGTCACCGC CAGCCCCGCC GACTCCACCG GCCCCACCGA 180

CTCCGCCGCT GCCACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 240

CACCCACGCC ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 300

CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAACGTCG 360

CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 420

CGCCGGCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 480

CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 540

TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 600

CAAACCTCGC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 660

CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 720

CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 780

CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 840

CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 900

ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 960

CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 1020

CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 1080

GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 1140

CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 1200

CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 1260

CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 1320

CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 1380

CGCCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 1440

TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGGCGTTGCC ACCGTTGCCA AACAGCAACC 1500

CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GCCGTTTCCG ATCAACGGGC 1560

GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 1620

CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680

ACGCTGCCAC CTGTACGCTG AGCGCCTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 1740

CAATCGCCGC CGACACTTCA TCGGCAGCCG CAGCCACCAC TTCCGTCGTC GGGATCGCCG 1800

CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 1860

CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 1920

CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 1980

GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 2040

GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 2100

AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 2160

CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 2220

GGCCTCGTTG CGCGTACGCG ATCGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 2280

CATCGTCGAA CTCGCTGTCA CCCATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 2340

CCGCCTGGTA GATCCCGAAG CGGACCG 2367



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
1 5 10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
20 25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
50 55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
65 70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140

His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
145 150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
210 215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
260 265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
305 310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg
370 375

(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 



GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 60

GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 120

CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 180

GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 240

GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 300

ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 360

ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 420

AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 480

CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 540

CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 600

ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 660

GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 720

GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 780

GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 840

GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 900

CTGATTTTCG GGAGCCCCGG CTTCTTCAAC TCGACCAGTG CGCCGTCGTC GGGATTCTTC 960

AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 1020

TTCAACTCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 1080

TCGGGCGTGA TCAACTCGGG CAACACCGTA TCGGGTTTGT TCAACATGAG CCTGGTGGCC 1140

ATCACAACGC CGGCCTTGAT CTCGGGCTTC TTCAACACCG GAAGCAACAT GTCGGGATTT 1200

TTCGGTGGCC CACCGGTCTT CAATCTCGGC CTGGCAAACC GGGGCGTCGT GAACATTCTC 1260

GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCGG TGACTTCAAC 1320

ATCCTTGGCA GCGGCAACCT CGGCAGCCAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 1380

TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560

AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 1620

AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCACCGGCAA CAGCGGCCTG 1680

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACCGGCTTC 1980

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTC GGGTTTCCAG 2520

AACCTCGGCT CGCTGCAGTC AGGCTGGGCG AACCTGGGCA ACTCCGTATC GGGCTTTTTC 2580

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCGGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700

AACCTGGGCC AGTTGAACAT CGGCAGCGCC TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760

GTTTCAACAA TCATATCCGC GTTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820

GTAAGCGAAT AAACCGAATG GCGGCCTGTC AT 2852

(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
50 55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
65 70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
115 120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
130 135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
145 150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
210 215 220

Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
225 230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe
305 310 315 320

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn
325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn
355 360 365

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala Ile Thr Thr Pro
370 375 380

Ala Leu Ile Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe
385 390 395 400

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val
405 410 415

Val Asn Ile Leu Gly Asn Ala Asn Ile Gly Asn Tyr Asn Ile Leu Gly
420 425 430

Ser Gly Asn Val Gly Asp Phe Asn Ile Leu Gly Ser Gly Asn Leu Gly
435 440 445

Ser Gln Asn Ile Leu Gly Ser Gly Asn Val Gly Ser Phe Asn Ile Gly
450 455 460

Ser Gly Asn Ile Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn
465 470 475 480

Tyr Asn Ile Gly Ser Gly Asn Leu Gly Ile Tyr Asn Ile Gly Phe Gly
485 490 495

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn
500 505 510

Gln Gly Phe Ala Asn Thr Gly Asn Asn Asn Ile Gly Phe Ala Asn Thr
515 520 525

Gly Asn Asn Asn Ile Gly Ile Gly Leu Ser Gly Asp Asn Gln Gln Gly
530 535 540

Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu
545 550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly
565 570 575 






Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn
580 585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605

Gly Ile Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly
610 615 620

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800

Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830

Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895

Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
930 935 940 

(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 
(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 
(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31 
(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33 
(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:210: 
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 38 
(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 
CCGCATGCGA GCCACGTGCC CACAACGGCC 30 
(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 
(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7676 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213: 



TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720

GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960

AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140

TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200

CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260

TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCCTGTTT 2700

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CGAACGTCAC 5460

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420

CCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

CGGACAGCCG CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

GGGCGCGGCC AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800

Pro Ala
CLAIMS

1. A polypeptide comprising an immunogenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected 
from the group consisting of: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln- 
Val-Val-Ala-Ala-Leu; (SEQ ID No. 120) 

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser; 
(SEQ ID No. 121) 

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala- 
Lys-Glu-Gly-Arg; (SEQ ID No. 122) 

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro; 
(SEQ ID No. 123) 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ 
ID No. 124) 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No. 
125) 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly; 
(SEQ ID No. 127) 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser- 
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn; (SEQ 
ID No. 128) and 

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
(SEQ ID No. 136)
wherein Xaa may be any amino acid. 

2. A polypeptide comprising an immunogenic portion of an 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129) and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may be any
amino acid.

3. A polypeptide comprising an immunogenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the complements of said sequences, and 
DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 
99 and 101 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201, the complements of said sequences, and
DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 26-51, 138, 139,
163-183 and 201 or a complement thereof under moderately stringent conditions.

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4. 

6. An expression vector comprising a DNA molecule according to claim 5.

7. A host cell transformed with an expression vector according to claim 6. 




8. The host cell of claim 7 wherein the host cell is selected from the group 
consisting of E. coli, yeast and mammalian cells. 

9. A pharmaceutical composition comprising one or more polypeptides 
according to any one of claims 1-4 and a physiologically acceptable carrier. 

10. A pharmaceutical composition comprising one or more DNA 
molecules according to claim 5 and a physiologically acceptable carrier. 

11. A pharmaceutical composition comprising one or more DNA 
sequences recited in SEQ ID Nos.: 3, 11, 12, 140 and 141; and a physiologically acceptable
carrier. 

12. A vaccine comprising one or more polypeptides according to any one 
of claims 1-4 and a non-specific immune response enhancer.

13. A vaccine comprising: 

a polypeptide having an N-terminal sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 134 and 135; and 
a non-specific immune response enhancer. 

14. A vaccine comprising: 

one or more polypeptides encoded by a DNA sequence selected from the 
group consisting of SEQ ID Nos.: 3, 11, 12, 140 and 141, the complements of said 
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 3, 11,
12, 140 and 141; and 

a non-specific immune response enhancer. 

15. The vaccine of claims 12-14 wherein the non-specific immune 
response enhancer is an adjuvant. 
16. A vaccine comprising one or more DNA molecules according to claim
5 and a non-specific immune response enhancer.

17. A vaccine comprising one or more DNA sequences recited in SEQ ID 
Nos.: 3, 11, 12, 140 and 141; and a non-specific immune response enhancer.

18. The vaccine of claims 16 or 17 wherein the non-specific immune 
response enhancer is an adjuvant. 

19. A pharmaceutical composition according to any one of claims 9-11, for
use in the manufacture of a medicament for inducing protective immunity in a patient. 

20. A vaccine according to any one of claims 12-18, for use in the 
manufacture of a medicament for inducing protective immunity in a patient. 

21. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

22. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6.

23. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 155).

24. A pharmaceutical composition comprising a fusion protein according 
to any one of claims 21-23 and a physiologically acceptable carrier. 

25. A vaccine comprising a fusion protein according to any one of claims 
21-23 and a non-specific immune response enhancer. 

26. The vaccine of claim 25 wherein the non-specific immune response 
enhancer is an adjuvant. 
27. A pharmaceutical composition according to claim 24, for use in the
manufacture of a medicament for inducing protective immunity in a patient. 

28. A vaccine according to claims 25 or 26, for use in the manufacture of a
medicament for inducing protective immunity in a patient. 

29. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with one or more polypeptides 
according to any one of claims 1-4; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

30. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences recited in SEQ ID NO: 
134 and 135; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

31. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with one or more polypeptides 
encoded by a DNA sequence selected from the group consisting of SEQ ID Nos.: 3, 11, 12, 
140, 141, 156-160, 189-193, 199, 200 and 203, the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 
189-193, 199, 200 and 203; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

32. The method of any one of claims 29-31 wherein the immune response
is induration.
33. A diagnostic kit comprising:

(a) a polypeptide according to any one of claims 1-4; and

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.



34. A diagnostic kit comprising: 

(a) a polypeptide having an N-terminal sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 134 and 135; and 

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

35. A diagnostic kit comprising: 

(a) a polypeptide encoded by a DNA sequence selected from the group 
consisting of SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203, the 
complements of said sequences, and DNA sequences that hybridize to a sequence recited in 
SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203; and 

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

36. A diagnostic kit comprising: 

(a) a fusion protein according to any one of claims 21-23; and 

(b) apparatus sufficient to contact said fusion protein with the dermal cells of a 
patient. 



37. A fusion protein according to claim 23 comprising an amino acid 
sequence selected from the group consisting of sequences recited in SEQ ID NO: 153 and 
209. 

COMPOUNDS AND METHODS FOR IMMUNOTHERAPY 
AND DIAGNOSIS OF TUBERCULOSIS 



TECHNICAL FIELD

The present invention relates generally to detecting, treating and 
preventing Mycobacterium tuberculosis infection. The invention is more particularly 
related to polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion 
or other variant thereof, and the use of such polypeptides for diagnosing and vaccinating 
against Mycobacterium tuberculosis infection.



BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing
countries, as well as an increasing problem in developed areas of the world, with about
8 million new cases and 3 million deaths each year. Although the infection may be
asymptomatic for a considerable period of time, the disease is most commonly
manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive
cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended
antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition,
although compliance with the treatment regimen is critical, patient behavior is difficult
to monitor. Some patients do not complete the course of treatment, which can lead to
ineffective treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis requires effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the
most efficient method for inducing protective immunity. The most common
Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an
avirulent strain of Mycobacterium bovis. However, the safety and efficacy of BCG are a
source of controversy and some countries, such as the United States, do not vaccinate
the general public. Diagnosis is commonly achieved using a skin test, which involves
intradermal exposure to tuberculin PPD (purified protein derivative). Antigen-specific
T cell responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to Mycobacterial antigens. Sensitivity and
specificity have, however, been a problem with this test, and individuals vaccinated
with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by
the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of
CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the
anti-mycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is
less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages
to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann in
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press,
Washington, DC, 1994.

Accordingly, there is a need in the art for improved vaccines and
methods for preventing, treating and detecting tuberculosis. The present invention
fulfills these needs and further provides other related advantages.

SUMMARY OF THE INVENTION 

Briefly stated, this invention provides compounds and methods for
preventing and diagnosing tuberculosis. In one aspect, polypeptides are provided
comprising an immunogenic portion of a soluble M. tuberculosis antigen, or a variant of
such an antigen that differs only in conservative substitutions and/or modifications. In
one embodiment of this aspect, the soluble antigen has one of the following N-terminal
sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser; (SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro; (SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
(SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly; (SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn; (SEQ ID No. 128)

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID No. 135) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID No. 136)
wherein Xaa may be any amino acid.
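
The N-terminal sequences above are written in three-letter amino acid code. The following is a minimal illustrative sketch, not part of the disclosed methods, of how such a hyphen-separated sequence might be converted to one-letter code for comparison against sequence databases. The function name `to_one_letter` and the mapping of Xaa to "X" are assumptions made for this example. The short antigen names used in the sequence identifiers (for example DPV for SEQ ID No. 120 and XDS for SEQ ID No. 134) appear to correspond to the first residues in one-letter code.

```python
# Illustrative sketch only: convert a hyphen-separated three-letter
# amino acid sequence into one-letter code.
# "Xaa" (any amino acid) is mapped to "X" here by assumption.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Return the one-letter form of a sequence such as 'Asp-Pro-Val'."""
    residues = [r.strip() for r in three_letter_seq.split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r] for r in residues)


# Sequences (a) and (j) from the list above.
SEQ_ID_120 = ("Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-"
              "Gln-Val-Val-Ala-Ala-Leu")
SEQ_ID_134 = "Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser"

if __name__ == "__main__":
    print(to_one_letter(SEQ_ID_120))
    print(to_one_letter(SEQ_ID_134))
```

An unrecognized residue code raises a `KeyError` in this sketch, which makes extraction errors in a transcribed sequence easy to spot.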

In a related aspect, polypeptides are provided comprising an
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications, the antigen having one
of the following N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)
wherein Xaa may be any amino acid.

In another embodiment, the soluble M. tuberculosis antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of
the sequences recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the
complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101 or a complement thereof
under moderately stringent conditions.
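
As an illustration of what "the complements of said sequences" refers to, the following minimal sketch computes the complementary strand of a DNA sequence, read 5' to 3' (the reverse complement). It is an example under stated assumptions rather than part of the disclosure: the function name `reverse_complement` is hypothetical, only the unambiguous bases A, C, G and T are handled, and the input fragments are arbitrary short strings chosen for illustration.

```python
# Illustrative sketch only: reverse complement of a DNA sequence.
# Handles A, C, G and T (either case); other characters pass through
# unchanged, which is an assumption of this example.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of `dna`, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    fragment = "ATGGCC"  # arbitrary example fragment
    print(reverse_complement(fragment))
    # Applying the operation twice returns the original sequence.
    print(reverse_complement(reverse_complement(fragment)) == fragment)
```

Whether two strands actually hybridize under moderately stringent conditions is an experimental question; exact complementarity, as computed here, is only the idealized case.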

In a related aspect, the polypeptides comprise an immunogenic portion
of an M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications, wherein the antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of
the sequences recited in SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201, the
complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201 or a complement thereof
under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides,
expression vectors comprising these DNA sequences and host cells transformed or
transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, an inventive
polypeptide and a known M. tuberculosis antigen.

Within other aspects, the present invention provides pharmaceutical
compositions that comprise one or more of the above polypeptides, or a DNA molecule
encoding such polypeptides, and a physiologically acceptable carrier. The invention
also provides vaccines comprising one or more of the polypeptides as described above
and a non-specific immune response enhancer, together with vaccines comprising one
or more DNA sequences encoding such polypeptides and a non-specific immune
response enhancer.

In yet another aspect, methods are provided for inducing protective
immunity in a patient, comprising administering to a patient an effective amount of one
or more of the above polypeptides.

In further aspects of this invention, methods and diagnostic kits are
provided for detecting tuberculosis in a patient. The methods comprise contacting
dermal cells of a patient with one or more of the above polypeptides and detecting an
immune response on the patient's skin. The diagnostic kits comprise one or more of the
above polypeptides in combination with an apparatus sufficient to contact the
polypeptide with the dermal cells of a patient.

In yet other aspects, methods are provided for detecting tuberculosis in a
patient, such methods comprising contacting dermal cells of a patient with one or more
polypeptides encoded by a DNA sequence selected from the group consisting of SEQ
ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203, the complements of
said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID
Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203; and detecting an
immune response on the patient's skin. Diagnostic kits for use in such methods are also
provided.

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if
each was incorporated individually. 

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figure 2 illustrates the stimulation of proliferation and interferon-γ
production in T cells derived from an M. tuberculosis-immune individual by the two
representative polypeptides TbRa3 and TbRa9.

Figures 3A-D illustrate the reactivity of antisera raised against secretory
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive
antigens Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M.
tuberculosis secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant
TbH-9 (lane 5) and recombinant 85b (lane 5).

Figure 4A illustrates the stimulation of proliferation in a TbH-9-specific
T cell clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control
antigen, TbRa11.

Figure 4B illustrates the stimulation of interferon-γ production in a
TbH-9-specific T cell clone by secretory M. tuberculosis proteins, PPD and recombinant
TbH-9.

Figures 5A and B illustrate the stimulation of proliferation and
interferon-γ production in TbH9-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 6A and B illustrate the stimulation of proliferation and
interferon-γ production in Tb38-1-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 7A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells previously shown to respond to both TbH-9 and
Tb38-1 by the fusion protein TbH9-Tb38-1.

Figures 8A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a first M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.

Figures 9A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a second M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-12.
SEQ. ID NO. 93 is the amino acid sequence of Tb38-1 Peptide 1.
SEQ. ID NO. 94 is the amino acid sequence of Tb38-1 Peptide 2.
SEQ. ID NO. 95 is the amino acid sequence of Tb38-1 Peptide 3.
SEQ. ID NO. 96 is the amino acid sequence of Tb38-1 Peptide 4.
SEQ. ID NO. 97 is the amino acid sequence of Tb38-1 Peptide 5.
SEQ. ID NO. 98 is the amino acid sequence of Tb38-1 Peptide 6.
SEQ. ID NO. 99 is the DNA sequence of DPAS.
SEQ. ID NO. 100 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 101 is the DNA sequence of DPV.
SEQ. ID NO. 102 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 103 is the DNA sequence of ESAT-6.
SEQ. ID NO. 104 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 105 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 106 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 107 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 108 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 109 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 110 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 111 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 112 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 113 is the DNA sequence of Tb38-2F2 RP.
SEQ. ID NO. 114 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 115 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 116 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 117 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 118 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 119 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 124 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 125 is the deduced N-terminal amino acid sequence of AEES.
SEQ. ID NO. 126 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 127 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 128 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 129 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 130-133 are the protein sequences of four DPPD cyanogen
bromide fragments.
SEQ ID NO. 134 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 135 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 136 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 137 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 138 is the DNA sequence of TbH-29.
SEQ ID NO. 139 is the DNA sequence of TbH-30.
SEQ ID NO. 140 is the DNA sequence of TbH-32.
SEQ ID NO. 141 is the DNA sequence of TbH-33.
SEQ ID NO. 142 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 143 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 144 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 145 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 146-151 are PCR primers used in the preparation of a fusion
protein containing TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 152 is the DNA sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.
SEQ ID NO: 153 is the amino acid sequence of the fusion protein containing
TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 154 is the DNA sequence of the M. tuberculosis antigen 38 kD.
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SEQ ID NO: 155 is the amino acid sequence of the M. tuberculosis antigen 38 
kD. 

SEQ ID NO: 156 is the DNA sequence of XP14. 

SEQ ID NO: 157 is the DNA sequence of XP24.
SEQ ID NO: 158 is the DNA sequence of XP31.
SEQ ID NO: 159 is the 5' DNA sequence of XP32.
SEQ ID NO: 160 is the 3' DNA sequence of XP32.
SEQ ID NO: 161 is the predicted amino acid sequence of XP14.
SEQ ID NO: 162 is the predicted amino acid sequence encoded by the reverse complement of XP14.
SEQ ID NO: 163 is the DNA sequence of XP27.
SEQ ID NO: 164 is the DNA sequence of XP36.
SEQ ID NO: 165 is the 5' DNA sequence of XP4.
SEQ ID NO: 166 is the 5' DNA sequence of XP5.
SEQ ID NO: 167 is the 5' DNA sequence of XP17.
SEQ ID NO: 168 is the 5' DNA sequence of XP30.
SEQ ID NO: 169 is the 5' DNA sequence of XP2.
SEQ ID NO: 170 is the 3' DNA sequence of XP2.
SEQ ID NO: 171 is the 5' DNA sequence of XP3.
SEQ ID NO: 172 is the 3' DNA sequence of XP3.
SEQ ID NO: 173 is the 5' DNA sequence of XP6.
SEQ ID NO: 174 is the 3' DNA sequence of XP6.
SEQ ID NO: 175 is the 5' DNA sequence of XP18.
SEQ ID NO: 176 is the 3' DNA sequence of XP18.
SEQ ID NO: 177 is the 5' DNA sequence of XP19.
SEQ ID NO: 178 is the 3' DNA sequence of XP19.
SEQ ID NO: 179 is the 5' DNA sequence of XP22.
SEQ ID NO: 180 is the 3' DNA sequence of XP22.
SEQ ID NO: 181 is the 5' DNA sequence of XP25.
SEQ ID NO: 182 is the 3' DNA sequence of XP25.
SEQ ID NO: 183 is the full-length DNA sequence of TbH4-XP1.
SEQ ID NO: 184 is the predicted amino acid sequence of TbH4-XP1.
SEQ ID NO: 185 is the predicted amino acid sequence encoded by the reverse complement of
TbH4-XP1.
SEQ ID NO: 186 is a first predicted amino acid sequence encoded by XP36.
SEQ ID NO: 187 is a second predicted amino acid sequence encoded by XP36.
SEQ ID NO: 188 is the predicted amino acid sequence encoded by the reverse complement of XP36.
SEQ ID NO:
is the predicted amino acid sequence of RDIF5.
is the predicted amino acid sequence of RDIF8.
is the predicted amino acid sequence of RDIF10.
is the 5' DNA sequence of RDIF12.
is the 3' DNA sequence of RDIF12.
is the DNA sequence of RDIF7.
of RDIF7.
SEQ ID NO: 203 is the DNA sequence of DIF2-1.
SEQ ID NO: 204 is the predicted amino acid sequence of DIF2-1.
SEQ ID NO: 205-212 are PCR primers used in the preparation of a fusion protein containing
TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as TbF-2).

SEQ ID NO: 213 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 214 is the amino acid sequence of the fusion protein TbF-2. 

DETAILED DESCRIPTION OF THE INVENTION

As noted above, the present invention is generally directed to compositions and methods
for preventing, treating and diagnosing tuberculosis. The compositions of the subject
invention include polypeptides that comprise at least one immunogenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications. Polypeptides within the scope of the present invention
include, but are not limited to, immunogenic soluble M. tuberculosis antigens. A "soluble
M. tuberculosis antigen" is a protein of M. tuberculosis origin that is present in
M. tuberculosis culture filtrate. As used herein, the term "polypeptide" encompasses amino
acid chains of any length, including full length proteins (i.e., antigens), wherein the
amino acid residues are linked by covalent peptide bonds. Thus, a polypeptide comprising
an immunogenic portion of one of the above antigens may consist entirely of the
immunogenic portion, or may contain additional sequences. The additional sequences may be
derived from the native M. tuberculosis antigen or may be heterologous, and such sequences
may (but need not) be immunogenic.

"Immunogenic," as used herein, refers to the ability to elicit an immune response (e.g.,
cellular) in a patient, such as a human, and/or in a biological sample. In particular,
antigens that are immunogenic (and immunogenic portions or other variants of such
antigens) are capable of stimulating cell proliferation, interleukin-12 production and/or
interferon-γ production in biological samples comprising one or more cells selected from
the group of T cells, NK cells, B cells and macrophages, where the cells are derived from
an M. tuberculosis-immune individual. Polypeptides comprising at least an immunogenic
portion of one or more M. tuberculosis antigens may generally be used to detect
tuberculosis or to induce protective immunity against tuberculosis in a patient.

The compositions and methods of this invention also encompass variants of the above
polypeptides. A "variant," as used herein, is a polypeptide that differs from the native
antigen only in conservative substitutions and/or modifications, such that the ability of
the polypeptide to induce an immune response is retained. Such variants may generally be
identified by modifying one of the above polypeptide sequences, and evaluating the
immunogenic properties of the modified polypeptide using, for example, the representative
procedures described herein.

A "conservative substitution" is one in which an amino acid is substituted for another
amino acid that has similar properties, such that one skilled in the art of peptide
chemistry would expect the secondary structure and hydropathic nature of the polypeptide
to be substantially unchanged. In general, the following groups of amino acids represent
conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr,
thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
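
The five groups above can be read as a membership test: a substitution may be treated as conservative when both residues share at least one group. The following is a minimal sketch under that reading; the function name `is_conservative` is illustrative only and does not appear in the disclosure.

```python
# Conservative-substitution groups as listed in the text (three-letter codes).
GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues appear together in at least one group."""
    a, b = original.lower(), replacement.lower()
    return any(a in group and b in group for group in GROUPS)
```

Under this sketch, `is_conservative("Lys", "Arg")` holds because both residues fall in group (4), whereas `is_conservative("Val", "Asp")` does not because no listed group contains both.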

Variants may also (or alternatively) be modified by, for example, the deletion or addition
of amino acids that have minimal influence on the immunogenic properties, secondary
structure and hydropathic nature of the polypeptide. For example, a polypeptide may be
conjugated to a signal (or leader) sequence at the N-terminal end of the protein which
co-translationally or post-translationally directs transfer of the protein. The
polypeptide may also be conjugated to a linker or other sequence for ease of synthesis,
purification or identification of the polypeptide (e.g., poly-His), or to enhance binding
of the polypeptide to a solid support. For example, a polypeptide may be conjugated to an
immunoglobulin Fc region.

In a related aspect, combination polypeptides are disclosed. A "combination polypeptide"
is a polypeptide comprising at least one of the above immunogenic portions and one or more
additional immunogenic M. tuberculosis sequences, which are joined via a peptide linkage
into a single amino acid chain. The sequences may be joined directly (i.e., with no
intervening amino acids) or may be joined by way of a linker sequence (e.g., Gly-Cys-Gly)
that does not significantly diminish the immunogenic properties of the component
polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such antigens, may be
prepared using any of a variety of procedures. For example, soluble antigens may be
isolated from M. tuberculosis culture filtrate by procedures known to those of ordinary
skill in the art, including anion-exchange and reverse phase chromatography. Purified
antigens are then evaluated for their ability to elicit an appropriate immune response
(e.g., cellular) using, for example, the representative methods described herein.
Immunogenic antigens may then be partially sequenced using techniques such as traditional
Edman chemistry. See Edman and Berg, Eur. J. Biochem. 50:116-132, 1967.
Immunogenic antigens may also be produced recombinantly using a DNA sequence that encodes
the antigen, which has been inserted into an expression vector and expressed in an
appropriate host. DNA molecules encoding soluble antigens may be isolated by screening an
appropriate M. tuberculosis expression library with anti-sera (e.g., rabbit) raised
specifically against soluble M. tuberculosis antigens. DNA sequences encoding antigens
that may or may not be soluble may be identified by screening an appropriate
M. tuberculosis genomic or cDNA expression library with sera obtained from patients
infected with M. tuberculosis. Such screens may generally be performed using techniques
well known to those of ordinary skill in the art, such as those described in Sambrook et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring
Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by screening an appropriate
M. tuberculosis cDNA or genomic DNA library for DNA sequences that hybridize to degenerate
oligonucleotides derived from partial amino acid sequences of isolated soluble antigens.
Degenerate oligonucleotide sequences for use in such a screen may be designed and
synthesized, and the screen may be performed, as described (for example) in Sambrook et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring
Harbor, NY, 1989 (and references cited therein). Polymerase chain reaction (PCR) may also
be employed, using the above oligonucleotides in methods well known in the art, to isolate
a nucleic acid probe from a cDNA or genomic library. The library screen may then be
performed using the isolated probe.
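
When designing a degenerate oligonucleotide from a partial amino acid sequence, one common consideration is how many distinct DNA sequences could encode a given stretch of residues. The sketch below is illustrative only: it assumes the standard genetic code, one-letter residue codes, and a six-residue window, none of which is specified in the text, and the names `degeneracy` and `best_window` are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_COUNT = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    fold = 1
    for residue in peptide:
        fold *= CODON_COUNT[residue]  # KeyError for unknown residues such as X
    return fold


def best_window(peptide: str, length: int = 6):
    """Return (start index, window, degeneracy) of the least degenerate window."""
    candidates = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    fold, start, window = min(candidates)  # ties resolve to the earliest start
    return start, window, fold
```

For instance, applied to the one-letter form of a fifteen-residue N-terminal sequence such as `"YYWCPGQPFDPAWGP"`, the sketch selects the first six residues, which contain the singly encoded Trp and several two-codon residues.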

Alternatively, genomic or cDNA libraries derived from M. tuberculosis may be screened
directly using peripheral blood mononuclear cells (PBMCs) or T cell lines or clones
derived from one or more M. tuberculosis-immune individuals. In general, PBMCs and/or
T cells for use in such screens may be prepared as described below. Direct library screens
may generally be performed by assaying pools of expressed recombinant proteins for the
ability to induce proliferation and/or interferon-γ production in T cells derived from an
M. tuberculosis-immune individual. Alternatively, potential T cell antigens may be first
selected based on antibody reactivity, as described above.

Regardless of the method of preparation, the antigens (and immunogenic portions thereof)
described herein (which may or may not be soluble) have the ability to induce an
immunogenic response. More specifically, the antigens have the ability to induce
proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) in T cells, NK cells, B cells and/or macrophages derived from an
M. tuberculosis-immune individual. The selection of cell type for use in evaluating an
immunogenic response to an antigen will, of course, depend on the desired response. For
example, interleukin-12 production is most readily evaluated using preparations containing
B cells and/or macrophages. An M. tuberculosis-immune individual is one who is considered
to be resistant to the development of tuberculosis by virtue of having mounted an
effective T cell response to M. tuberculosis (i.e., substantially free of disease
symptoms). Such individuals may be identified based on a strongly positive (i.e., greater
than about 10 mm diameter induration) intradermal skin test response to tuberculosis
proteins (PPD) and an absence of any signs or symptoms of tuberculosis disease. T cells,
NK cells, B cells and macrophages derived from M. tuberculosis-immune individuals may be
prepared using methods known to those of ordinary skill in the art. For example, a
preparation of PBMCs (i.e., peripheral blood mononuclear cells) may be employed without
further separation of component cells. PBMCs may generally be prepared, for example, using
density centrifugation through Ficoll™ (Winthrop Laboratories, NY). T cells for use in the
assays described herein may also be purified directly from PBMCs. Alternatively, an
enriched T cell line reactive against mycobacterial proteins, or T cell clones reactive to
individual mycobacterial proteins, may be employed. Such T cell clones may be generated
by, for example, culturing PBMCs from M. tuberculosis-immune individuals with
mycobacterial proteins for a period of 2-4 weeks. This allows expansion of only the
mycobacterial protein-specific T cells, resulting in a line composed solely of such cells.
These cells may then be cloned and tested with individual proteins, using methods known to
those of ordinary skill in the art, to more accurately define individual T cell
specificity. In general, antigens that test positive in assays for proliferation and/or
cytokine production (i.e., interferon-γ and/or interleukin-12 production) performed using
T cells, NK cells, B cells and/or macrophages derived from an M. tuberculosis-immune
individual are considered immunogenic. Such assays may be performed, for example, using
the representative procedures described below. Immunogenic portions of such antigens may
be identified using similar assays, and may be present within the polypeptides described
herein.
The ability of a polypeptide (e.g., an immunogenic antigen, or a portion or other variant
thereof) to induce cell proliferation is evaluated by contacting the cells (e.g., T cells
and/or NK cells) with the polypeptide and measuring the proliferation of the cells. In
general, the amount of polypeptide that is sufficient for evaluation of about 10^5 cells
ranges from about 10 ng/mL to about 100 µg/mL and preferably is about 10 ng/mL. The
incubation of polypeptide with cells is typically performed at 37°C for about six days.
Following incubation with polypeptide, the cells are assayed for a proliferative response,
which may be evaluated by methods known to those of ordinary skill in the art, such as
exposing cells to a pulse of radiolabeled thymidine and measuring the incorporation of
label into cellular DNA. In general, a polypeptide that results in at least a three fold
increase in proliferation above background (i.e., the proliferation observed for cells
cultured without polypeptide) is considered to be able to induce proliferation.
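
The three-fold criterion above amounts to a ratio of label incorporation with and without polypeptide. A minimal sketch follows, assuming incorporation is reported as counts per minute (cpm); the function names and the cpm unit are illustrative assumptions rather than part of the described procedure.

```python
def stimulation_index(cpm_with_polypeptide: float, cpm_background: float) -> float:
    """Fold increase in proliferation over cells cultured without polypeptide."""
    if cpm_background <= 0:
        raise ValueError("background counts must be positive")
    return cpm_with_polypeptide / cpm_background


def induces_proliferation(cpm_with_polypeptide: float,
                          cpm_background: float,
                          threshold: float = 3.0) -> bool:
    """Apply the 'at least three fold above background' criterion from the text."""
    return stimulation_index(cpm_with_polypeptide, cpm_background) >= threshold
```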

The ability of a polypeptide to stimulate the production of interferon-γ and/or
interleukin-12 in cells may be evaluated by contacting the cells with the polypeptide and
measuring the level of interferon-γ or interleukin-12 produced by the cells. In general,
the amount of polypeptide that is sufficient for the evaluation of about 10^5 cells ranges
from about 10 ng/mL to about 100 µg/mL and preferably is about 10 ng/mL. The polypeptide
may, but need not, be immobilized on a solid support, such as a bead or a biodegradable
microsphere, such as those described in U.S. Patent Nos. 4,897,268 and 5,075,109. The
incubation of polypeptide with the cells is typically performed at 37°C for about six
days. Following incubation with polypeptide, the cells are assayed for interferon-γ and/or
interleukin-12 (or one or more subunits thereof), which may be evaluated by methods known
to those of ordinary skill in the art, such as an enzyme-linked immunosorbent assay
(ELISA) or, in the case of IL-12 P70 subunit, a bioassay such as an assay measuring
proliferation of T cells. In general, a polypeptide that results in the production of at
least 50 pg of interferon-γ per mL of cultured supernatant (containing 10^4-10^5 T cells
per mL) is considered able to stimulate the production of interferon-γ. A polypeptide that
stimulates the production of at least 10 pg/mL of IL-12 P70 subunit, and/or at least
100 pg/mL of IL-12 P40 subunit, per 10^5 macrophages or B cells (or per 3 x 10^5 PBMC) is
considered able to stimulate the production of IL-12.
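
The cytokine cut-offs stated above may be summarized as simple comparisons. The sketch below assumes that measured concentrations are already normalized to the cell numbers given in the text; the constant and function names are hypothetical.

```python
# Thresholds taken from the text, in pg per mL of culture supernatant.
IFN_GAMMA_MIN_PG_PER_ML = 50.0
IL12_P70_MIN_PG_PER_ML = 10.0
IL12_P40_MIN_PG_PER_ML = 100.0


def stimulates_ifn_gamma(ifn_gamma_pg_per_ml: float) -> bool:
    """At least 50 pg/mL interferon-gamma is treated as a positive response."""
    return ifn_gamma_pg_per_ml >= IFN_GAMMA_MIN_PG_PER_ML


def stimulates_il12(p70_pg_per_ml: float = 0.0, p40_pg_per_ml: float = 0.0) -> bool:
    """Positive if either the P70 or the P40 subunit reaches its threshold."""
    return (p70_pg_per_ml >= IL12_P70_MIN_PG_PER_ML
            or p40_pg_per_ml >= IL12_P40_MIN_PG_PER_ML)
```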

In general, immunogenic antigens are those antigens that stimulate proliferation and/or
cytokine production (i.e., interferon-γ and/or interleukin-12 production) in T cells, NK
cells, B cells and/or macrophages derived from at least about 25% of
M. tuberculosis-immune individuals. Among these immunogenic antigens, polypeptides having
superior therapeutic properties may be distinguished based on the magnitude of the
responses in the above assays and based on the percentage of individuals for which a
response is observed. In addition, antigens having superior therapeutic properties will
not stimulate proliferation and/or cytokine production in vitro in cells derived from more
than about 25% of individuals that are not M. tuberculosis-immune, thereby eliminating
responses that are not specifically due to M. tuberculosis-responsive cells. Those
antigens that induce a response in a high percentage of T cell, NK cell, B cell and/or
macrophage preparations from M. tuberculosis-immune individuals (with a low incidence of
responses in cell preparations from other individuals) have superior therapeutic
properties.
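
The percentage criteria above can be illustrated by tallying responders in the two donor groups. This is a sketch only: it treats the "about 25%" figures as exact cut-offs for simplicity, ignores response magnitude, and uses hypothetical names throughout.

```python
def fraction_responding(responses):
    """Fraction of donors (list of booleans) whose cells gave a positive response."""
    if not responses:
        raise ValueError("at least one donor is required")
    return sum(1 for r in responses if r) / len(responses)


def assess_antigen(immune_responses, non_immune_responses,
                   min_immune=0.25, max_non_immune=0.25):
    """Return (immunogenic, specific) according to the two percentage criteria."""
    immunogenic = fraction_responding(immune_responses) >= min_immune
    specific = fraction_responding(non_immune_responses) <= max_non_immune
    return immunogenic, specific
```

An antigen for which both values are true responds in a sufficient share of immune donors while remaining largely silent in non-immune donors.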

Antigens with superior therapeutic properties may also be identified based on their
ability to diminish the severity of M. tuberculosis infection in experimental animals,
when administered as a vaccine. Suitable vaccine preparations for use on experimental
animals are described in detail below. Efficacy may be determined based on the ability of
the antigen to provide at least about a 50% reduction in bacterial numbers and/or at least
about a 40% decrease in mortality following experimental infection. Suitable experimental
animals include mice, guinea pigs and primates.
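
The efficacy criteria above can be expressed as percent reductions relative to an unvaccinated control group. The following sketch assumes bacterial numbers are given as colony counts and mortality as a fraction of animals; these units and the function names are assumptions for illustration.

```python
def percent_reduction(control: float, vaccinated: float) -> float:
    """Percent decrease of a measurement in vaccinated animals relative to controls."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - vaccinated) / control


def is_efficacious(cfu_control: float, cfu_vaccinated: float,
                   mortality_control: float, mortality_vaccinated: float) -> bool:
    """At least ~50% fewer bacteria and/or at least ~40% lower mortality."""
    bacterial = percent_reduction(cfu_control, cfu_vaccinated)
    mortality = percent_reduction(mortality_control, mortality_vaccinated)
    return bacterial >= 50.0 or mortality >= 40.0
```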

Antigens having superior diagnostic properties may generally be identified based on the
ability to elicit a response in an intradermal skin test performed on an individual with
active tuberculosis, but not in a test performed on an individual who is not infected with
M. tuberculosis. Skin tests may generally be performed as described below, with a response
of at least 5 mm induration considered positive.

Immunogenic portions of the antigens described herein may be prepared and identified using
well known techniques, such as those summarized in Paul, Fundamental Immunology, 3d ed.,
Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques include
screening polypeptide portions of the native antigen for immunogenic properties. The
representative proliferation and cytokine production assays described herein may generally
be employed in these screens. An immunogenic portion of a polypeptide is a portion that,
within such representative assays, generates an immune response (e.g., proliferation,
interferon-γ production and/or interleukin-12 production) that is substantially similar to
that generated by the full length antigen. In other words, an immunogenic portion of an
antigen may generate at least about 20%, and preferably about 100%, of the proliferation
induced by the full length antigen in the model proliferation assay described herein. An
immunogenic portion may also, or alternatively, stimulate the production of at least about
20%, and preferably about 100%, of the interferon-γ and/or interleukin-12 induced by the
full length antigen in the model assay described herein.
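
One way to picture such a screen is to split the native sequence into candidate portions and retain those whose assay response reaches about 20% of the full length response. The sketch below is a hypothetical illustration: the use of overlapping fixed-length peptides, the window and step sizes, and all names are assumptions not stated in the text.

```python
def overlapping_peptides(sequence: str, length: int = 15, step: int = 5):
    """Split a sequence into fixed-length peptides starting every `step` residues."""
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


def immunogenic_portions(responses: dict, full_length_response: float,
                         min_fraction: float = 0.20):
    """Keep portions whose response is at least min_fraction of the full length value."""
    if full_length_response <= 0:
        raise ValueError("full length response must be positive")
    return [portion for portion, value in responses.items()
            if value / full_length_response >= min_fraction]
```

The `responses` mapping would hold one measured value per portion (for example, proliferation or interferon-γ level) obtained in the same assay as the full length antigen.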

Portions and other variants of M. tuberculosis antigens may be generated by synthetic or
recombinant means. Synthetic polypeptides having fewer than about 100 amino acids, and
generally fewer than about 50 amino acids, may be generated using techniques well known to
those of ordinary skill in the art. For example, such polypeptides may be synthesized
using any of the commercially available solid-phase techniques, such as the Merrifield
solid-phase synthesis method, where amino acids are sequentially added to a growing amino
acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2146, 1963. Equipment for automated
synthesis of polypeptides is commercially available from suppliers such as Applied
BioSystems, Inc., Foster City, CA, and may be operated according to the manufacturer's
instructions. Variants of a native antigen may generally be prepared using standard
mutagenesis techniques, such as oligonucleotide-directed site-specific mutagenesis.
Sections of the DNA sequence may also be removed using standard techniques to permit
preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a native antigen may be
readily prepared from a DNA sequence encoding the polypeptide using a variety of
techniques well known to those of ordinary skill in the art. For example, supernatants
from suitable host/vector systems which secrete recombinant protein into culture media may
be first concentrated using a commercially available filter. Following concentration, the
concentrate may be applied to a suitable purification matrix such as an affinity matrix or
an ion exchange resin. Finally, one or more reverse phase HPLC steps can be employed to
further purify a recombinant protein.

Any of a variety of expression vectors known to those of ordinary skill in the art may be
employed to express recombinant polypeptides of this invention. Expression may be achieved
in any appropriate host cell that has been transformed or transfected with an expression
vector containing a DNA molecule that encodes a recombinant polypeptide. Suitable host
cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host cells
employed are E. coli, yeast or a mammalian cell line such as COS or CHO. The DNA sequences
expressed in this manner may encode naturally occurring antigens, portions of naturally
occurring antigens, or other variants thereof.

In general, regardless of the method of preparation, the polypeptides disclosed herein are
prepared in substantially pure form. Preferably, the polypeptides are at least about 80%
pure, more preferably at least about 90% pure and most preferably at least about 99% pure.
In certain preferred embodiments, described in detail below, the substantially pure
polypeptides are incorporated into pharmaceutical compositions or vaccines for use in one
or more of the methods disclosed herein.

In certain specific embodiments, the subject invention discloses polypeptides comprising
at least an immunogenic portion of a soluble M. tuberculosis antigen having one of the
following N-terminal sequences, or a variant thereof that differs only in conservative
substitutions and/or modifications:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-Val-Val-Ala-Ala-Leu;
    (SEQ ID No. 120)
(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser; (SEQ ID No. 121)
(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-Lys-Glu-Gly-Arg;
    (SEQ ID No. 122)
(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro; (SEQ ID No. 123)
(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ ID No. 124)
(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No. 125)
(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ser-Pro-Pro-Ser;
    (SEQ ID No. 126)
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly; (SEQ ID No. 127)
(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-Leu-Leu-Asn-Ser-
    Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn; (SEQ ID No. 128)
(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser; (SEQ ID No. 134)
(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp; (SEQ ID No. 135) or
(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly; (SEQ ID No. 136)
wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence encoding
the antigen identified as (g) above is provided in SEQ ID No. 52, and the polypeptide
encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. A DNA sequence encoding the antigen
defined as (a) above is provided in SEQ ID No. 101; its deduced amino acid sequence is
provided in SEQ ID No. 102. A DNA sequence corresponding to antigen (d) above is provided
in SEQ ID No. 24, a DNA sequence corresponding to antigen (c) is provided in SEQ ID
No. 25, and a DNA sequence corresponding to antigen (i) is provided in SEQ ID No. 99; its
deduced amino acid sequence is provided in SEQ ID No. 100.
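
For sequence comparison it is often convenient to convert the hyphenated three-letter notation used above into one-letter codes, with Xaa rendered as X. A minimal sketch follows; the helper name `to_one_letter` is illustrative and not part of the disclosure.

```python
# Standard three-letter to one-letter amino acid codes; Xaa marks an unknown residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(sequence: str) -> str:
    """Convert e.g. 'Asp-Pro-Glu' to 'DPE'; raises KeyError on unknown codes."""
    return "".join(THREE_TO_ONE[residue.strip()] for residue in sequence.split("-"))
```

Applied to sequence (g) above, the conversion yields `DPEPAPPVPTAAASPPS`.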

In a further specific embodiment, the subject invention discloses polypeptides comprising
at least an immunogenic portion of an M. tuberculosis antigen having one of the following
N-terminal sequences, or a variant thereof that differs only in conservative substitutions
and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-Asn-Val-His-Leu-Val;
    (SEQ ID No. 137) or
(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-Pro-Gly-Gly-Arg-Arg-
    Xaa-Phe; (SEQ ID No. 129)
wherein Xaa may be any amino acid, preferably a cysteine residue.

In other specific embodiments, the subject invention discloses polypeptides comprising at
least an immunogenic portion of a soluble M. tuberculosis antigen (or a variant of such an
antigen) that comprises one or more of the amino acid sequences encoded by (a) the DNA
sequences of SEQ ID Nos.: 1, 2, 4-10, 13-25 and 52; (b) the complements of such DNA
sequences, or (c) DNA sequences substantially homologous to a sequence in (a) or (b).

In further specific embodiments, the subject invention discloses polypeptides comprising
at least an immunogenic portion of an M. tuberculosis antigen (or a variant of such an
antigen), which may or may not be soluble, that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID Nos.: 26-51, 138, 139, 163-183 and
201, (b) the complements of such DNA sequences or (c) DNA sequences substantially
homologous to a sequence in (a) or (b).

In the specific embodiments discussed above, the M. tuberculosis antigens include variants
that are encoded by DNA sequences which are substantially homologous to one or more of the
DNA sequences specifically recited herein. "Substantial homology," as used herein, refers
to DNA sequences that are capable of hybridizing under moderately stringent conditions.
Suitable moderately stringent conditions include prewashing in a solution of 5X SSC, 0.5%
SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight or, in the case of
cross-species homology, at 45°C, 0.5X SSC; followed by washing twice at 65°C for 20
minutes with each of 2X, 0.5X and 0.2X SSC containing 0.1% SDS. Such hybridizing DNA
sequences are also within the scope of this invention, as are nucleotide sequences that,
due to code degeneracy, encode an immunogenic polypeptide that is encoded by a hybridizing
DNA sequence.

In a related aspect, the present invention provides fusion proteins comprising a first and
a second inventive polypeptide or, alternatively, a polypeptide of the present invention
and a known M. tuberculosis antigen, such as the 38 kD antigen described in Andersen and
Hansen, Infect. Immun. 57:2481-2488, 1989 (Genbank Accession No. M30046) or ESAT-6 (SEQ ID
Nos. 103 and 104), together with variants of such fusion proteins. The fusion proteins of
the present invention may also include a linker peptide between the first and second
polypeptides.

A DNA sequence encoding a fusion protein of the present invention is constructed using
known recombinant DNA techniques to assemble separate DNA sequences encoding the first and
second polypeptides into an appropriate expression vector. The 3' end of a DNA sequence
encoding the first polypeptide is ligated, with or without a peptide linker, to the 5' end
of a DNA sequence encoding the second polypeptide so that the reading frames of the
sequences are in phase to permit mRNA translation of the two DNA sequences into a single
fusion protein that retains the biological activity of both the first and the second
polypeptides.

A peptide linker sequence may be employed to separate the first and the second
polypeptides by a distance sufficient to ensure that each polypeptide folds into its
secondary and tertiary structures. Such a peptide linker sequence is incorporated into the
fusion protein using standard techniques well known in the art. Suitable peptide linker
sequences may be chosen based on the following factors: (1) their ability to adopt a
flexible extended conformation; (2) their inability to adopt a secondary structure that
could interact with functional epitopes on the first and second polypeptides; and (3) the
lack of hydrophobic or charged residues that might react with the polypeptide functional
epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser residues. Other near
neutral amino acids, such as Thr and Ala, may also be used in the linker sequence. Amino
acid sequences which may be usefully employed as linkers include those disclosed in
Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl. Acad. Sci. USA
83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent No. 4,751,180. The linker
sequence may be from 1 to about 50 amino acids in length. Peptide linker sequences are not
required when the first and second polypeptides have non-essential N-terminal amino acid
regions that can be used to separate the functional domains and prevent steric
interference.

The ligated DNA sequences are operably linked to suitable transcriptional or translational
regulatory elements. The regulatory elements responsible for expression of DNA are located
only 5' to the DNA sequence encoding the first polypeptide. Similarly, stop codons
required to end translation and transcription termination signals are only present 3' to
the DNA sequence encoding the second polypeptide.
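
The requirement that the two coding sequences be joined in phase, with a stop codon only after the second polypeptide, can be sketched as a simple string operation. This is an illustrative simplification that ignores restriction sites, internal stop codons and regulatory elements; the function names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(cds: str) -> str:
    """Remove a terminal stop codon so translation can continue into the next part."""
    cds = cds.upper()
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def fuse_coding_sequences(first_cds: str, second_cds: str, linker_dna: str = "") -> str:
    """Join first + optional linker + second while keeping a single reading frame."""
    first = strip_stop(first_cds)
    linker = linker_dna.upper()
    second = second_cds.upper()
    for name, part in (("first", first), ("linker", linker), ("second", second)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} sequence is not a whole number of codons")
    if second[-3:] not in STOP_CODONS:
        raise ValueError("second coding sequence must end with a stop codon")
    return first + linker + second
```

For example, `fuse_coding_sequences("ATGGCTTAA", "ATGAAATGA", "GGTTGTGGT")` drops the stop codon of the first sequence and inserts a nine-nucleotide linker encoding Gly-Cys-Gly before the second sequence.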

In another aspect, the present invention provides methods for using one or more of the
above polypeptides or fusion proteins (or DNA molecules encoding such polypeptides) to
induce protective immunity against tuberculosis in a patient. As used herein, a "patient"
refers to any warm-blooded animal, preferably a human. A patient may be afflicted with a
disease, or may be free of detectable disease and/or infection. In other words, protective
immunity may be induced to prevent or treat tuberculosis.

In this aspect, the polypeptide, fusion protein or DNA molecule is generally present
within a pharmaceutical composition and/or a vaccine. Pharmaceutical compositions may
comprise one or more polypeptides, each of which may contain one or more of the above
sequences (or variants thereof), and a physiologically acceptable carrier. Vaccines may
comprise one or more of the above polypeptides and a non-specific immune response
enhancer, such as an adjuvant or a liposome (into which the polypeptide is incorporated).
Such pharmaceutical compositions and vaccines may also contain other M. tuberculosis
antigens, either incorporated into a combination polypeptide or present within a separate
polypeptide.
Alternatively, a vaccine may contain DNA encoding one or more
polypeptides as described above, such that the polypeptide is generated in situ. In such
vaccines, the DNA may be present within any of a variety of delivery systems known to
those of ordinary skill in the art, including nucleic acid expression systems, bacterial
and viral expression systems. Appropriate nucleic acid expression systems contain the
necessary DNA sequences for expression in the patient (such as a suitable promoter and
terminating signal). Bacterial delivery systems involve the administration of a
bacterium (such as Bacillus Calmette-Guerin) that expresses an immunogenic portion
of the polypeptide on its cell surface. In a preferred embodiment, the DNA may be
introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus,
or adenovirus), which may involve the use of a non-pathogenic (defective), replication
competent virus. Techniques for incorporating DNA into such expression systems are
well known to those of ordinary skill in the art. The DNA may also be "naked," as
described, for example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by
Cohen, Science 259:1691-1692, 1993. The uptake of naked DNA may be increased by
coating the DNA onto biodegradable beads, which are efficiently transported into the
cells.

In a related aspect, a DNA vaccine as described above may be
administered simultaneously with or sequentially to either a polypeptide of the present
invention or a known M. tuberculosis antigen, such as the 38 kD antigen described
above. For example, administration of DNA encoding a polypeptide of the present
invention, either "naked" or in a delivery system as described above, may be followed
by administration of an antigen in order to enhance the protective immune effect of the
vaccine.

Routes and frequency of administration, as well as dosage, will vary
from individual to individual and may parallel those currently being used in
immunization using BCG. In general, the pharmaceutical compositions and vaccines
may be administered by injection (e.g., intracutaneous, intramuscular, intravenous or
subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 3 doses may
be administered for a 1-36 week period. Preferably, 3 doses are administered, at
intervals of 3-4 months, and booster vaccinations may be given periodically thereafter.
Alternate protocols may be appropriate for individual patients. A suitable dose is an
amount of polypeptide or DNA that, when administered as described above, is capable
of raising an immune response in an immunized patient sufficient to protect the patient
from M. tuberculosis infection for at least 1-2 years. In general, the amount of
polypeptide present in a dose (or produced in situ by the DNA in a dose) ranges from
about 1 pg to about 100 mg per kg of host, typically from about 10 pg to about 1 mg,
and preferably from about 100 pg to about 1 µg. Suitable dose sizes will vary with the
size of the patient, but will typically range from about 0.1 mL to about 5 mL.

While any suitable carrier known to those of ordinary skill in the art may
be employed in the pharmaceutical compositions of this invention, the type of carrier
will vary depending on the mode of administration. For parenteral administration, such
as subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a
wax or a buffer. For oral administration, any of the above carriers or a solid carrier,
such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum,
cellulose, glucose, sucrose, and magnesium carbonate, may be employed.
Biodegradable microspheres (e.g., polylactic galactide) may also be employed as
carriers for the pharmaceutical compositions of this invention. Suitable biodegradable
microspheres are disclosed, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109.

Any of a variety of adjuvants may be employed in the vaccines of this
invention to nonspecifically enhance the immune response. Most adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum
hydroxide or mineral oil, and a nonspecific stimulator of immune responses, such as
lipid A, Bordetella pertussis or Mycobacterium tuberculosis. Suitable adjuvants are
commercially available as, for example, Freund's Incomplete Adjuvant and Freund's
Complete Adjuvant (Difco Laboratories) and Merck Adjuvant 65 (Merck and
Company, Inc., Rahway, NJ). Other suitable adjuvants include alum, biodegradable
microspheres, monophosphoryl lipid A and quil A.

In another aspect, this invention provides methods for using one or more
of the polypeptides described above to diagnose tuberculosis using a skin test. As used
herein, a "skin test" is any assay performed directly on a patient in which a delayed-type
hypersensitivity (DTH) reaction (such as swelling, reddening or dermatitis) is measured
following intradermal injection of one or more polypeptides as described above. Such
injection may be achieved using any suitable device sufficient to contact the
polypeptide or polypeptides with dermal cells of the patient, such as a tuberculin
syringe or 1 mL syringe. Preferably, the reaction is measured at least 48 hours after
injection, more preferably 48-72 hours.

The DTH reaction is a cell-mediated immune response, which is greater
in patients that have been exposed previously to the test antigen (i.e., the immunogenic
portion of the polypeptide employed, or a variant thereof). The response may be
measured visually, using a ruler. In general, a response that is greater than about 0.5 cm
in diameter, preferably greater than about 1.0 cm in diameter, is a positive response,
indicative of tuberculosis infection, which may or may not be manifested as an active
disease.
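
The reading rule above can be sketched in a few lines. This is only an illustration, assuming the reaction diameter is recorded in centimetres; the function name and the returned labels are hypothetical and are not part of the specification, and the "about" in the stated thresholds is treated here as an exact cutoff.

```python
def classify_dth_response(diameter_cm: float) -> str:
    """Classify a skin-test reaction by its measured diameter in cm.

    Thresholds follow the text: greater than about 0.5 cm is positive,
    and greater than about 1.0 cm meets the preferred threshold.
    Treating "about" as an exact cutoff is an assumption of this sketch.
    """
    if diameter_cm > 1.0:
        return "positive (preferred threshold)"
    if diameter_cm > 0.5:
        return "positive"
    return "negative"


if __name__ == "__main__":
    for d in (0.3, 0.7, 1.4):
        print(f"{d:.1f} cm -> {classify_dth_response(d)}")
```

A positive result in this sketch indicates infection only; as the text notes, it does not distinguish active disease from latent infection.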

The polypeptides of this invention are preferably formulated, for use in a
skin test, as pharmaceutical compositions containing a polypeptide and a
physiologically acceptable carrier, as described above. Such compositions typically
contain one or more of the above polypeptides in an amount ranging from about 1 µg to
about 100 µg, preferably from about 10 µg to about 50 µg in a volume of 0.1 mL.
Preferably, the carrier employed in such pharmaceutical compositions is a saline
solution with appropriate preservatives, such as phenol and/or Tween 80™.
In a preferred embodiment, a polypeptide employed in a skin test is of
sufficient size such that it remains at the site of injection for the duration of the reaction
period. In general, a polypeptide that is at least 9 amino acids in length is sufficient.
The polypeptide is also preferably broken down by macrophages within hours of
injection to allow presentation to T-cells. Such polypeptides may contain repeats of one
or more of the above sequences and/or other immunogenic or nonimmunogenic
sequences.



The following Examples are offered by way of illustration and not by
way of limitation.



EXAMPLES

EXAMPLE 1

Purification and Characterization of Polypeptides
from M. tuberculosis Culture Filtrate

This example illustrates the preparation of M. tuberculosis soluble
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the
following example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a
sterile 2.5 L bottle. The media was next filtered through a 0.2 µ filter into a sterile 4 L
bottle and NaN3 was added to the culture filtrate to a concentration of 0.04%. The
bottles were then placed in a 4°C cold room.

The culture filtrate was concentrated by placing the filtrate in a 12 L 
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell 
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. 
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the
12 L volume to approximately 50 ml.

The culture filtrate was dialyzed into 0.1% ammonium bicarbonate using
an 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium
bicarbonate solution. Protein concentration was then determined by a commercially
available BCA assay (Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides
resuspended in distilled water. The polypeptides were dialyzed against 0.01 mM 1,3
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the
initial conditions for anion exchange chromatography. Fractionation was performed
using gel profusion chromatography on a POROS 146 II Q/M anion exchange column
4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl
gradient in the above buffer system. The column eluent was monitored at a wavelength
of 220 nm.

The pools of polypeptides eluting from the ion exchange column were
dialyzed against distilled water and lyophilized. The resulting material was dissolved in
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions
containing the eluted polypeptides were collected to maximize the purity of the
individual samples. Approximately 200 purified polypeptides were obtained.

The purified polypeptides were then screened for the ability to induce T-cell
proliferation in PBMC preparations. The PBMCs from donors known to be PPD
skin test positive and whose T-cells were shown to proliferate in response to PPD and
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six
days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium
was removed from each well for determination of IFN-γ levels, as described below.
The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter.
Fractions that resulted in proliferation in both replicates three fold greater than the
proliferation observed in cells cultured in medium alone were considered positive.
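
The positivity rule for the proliferation screen can be written as a short check. The sketch below is illustrative only: the function name, the use of counts per minute as the readout unit, and the reading of "three fold greater" as strictly more than three times the medium-alone value are assumptions, not details given in the text.

```python
from typing import Sequence


def proliferation_positive(replicate_cpm: Sequence[float],
                           medium_cpm: float,
                           fold: float = 3.0) -> bool:
    """Return True when every replicate exceeds `fold` times background.

    replicate_cpm: tritium uptake for each replicate well (assumed cpm).
    medium_cpm:    uptake for cells cultured in medium alone.
    """
    threshold = fold * medium_cpm
    # The text requires the criterion to hold in both replicates,
    # so a single weak well makes the fraction negative.
    return all(cpm > threshold for cpm in replicate_cpm)


if __name__ == "__main__":
    print(proliferation_positive([4200.0, 3900.0], medium_cpm=1000.0))
    print(proliferation_positive([4200.0, 2500.0], medium_cpm=1000.0))
```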

IFN-γ was measured using an enzyme-linked immunosorbent assay
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to
human IFN-γ (PharMingen, San Diego, CA) in PBS for four hours at room temperature.
Wells were then blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour
at room temperature. The plates were then washed six times in PBS/0.2% TWEEN-20
and samples diluted 1:2 in culture medium in the ELISA plates were incubated
overnight at room temperature. The plates were again washed and a polyclonal rabbit
anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to
each well. The plates were then incubated for two hours at room temperature, washed
and horseradish peroxidase-coupled anti-rabbit IgG (Sigma Chemical Co., St. Louis,
MO) was added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two
hour incubation at room temperature, the plates were washed and TMB substrate added.
The reaction was stopped after 20 min with 1 N sulfuric acid. Optical density was
determined at 450 nm using 570 nm as a reference wavelength. Fractions that resulted
in both replicates giving an OD two fold greater than the mean OD from cells cultured
in medium alone, plus 3 standard deviations, were considered positive.
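
The ELISA cutoff can be sketched as follows. The wording of the criterion admits more than one reading; this sketch assumes the cutoff is twice the mean medium-alone OD plus three sample standard deviations of the medium-alone wells. The function names are hypothetical, and the OD values in the usage lines are made up for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def ifn_gamma_cutoff(medium_ods: Sequence[float],
                     fold: float = 2.0,
                     n_sd: float = 3.0) -> float:
    """Cutoff OD under the assumed reading: fold * mean + n_sd * SD.

    medium_ods holds OD readings (450 nm, 570 nm reference) from cells
    cultured in medium alone; at least two readings are needed for SD.
    """
    return fold * mean(medium_ods) + n_sd * stdev(medium_ods)


def ifn_gamma_positive(replicate_ods: Sequence[float],
                       medium_ods: Sequence[float]) -> bool:
    """True only when both (all) replicates exceed the cutoff."""
    cutoff = ifn_gamma_cutoff(medium_ods)
    return all(od > cutoff for od in replicate_ods)


if __name__ == "__main__":
    background = [0.10, 0.12, 0.11]  # illustrative values only
    print(round(ifn_gamma_cutoff(background), 3))
    print(ifn_gamma_positive([0.30, 0.28], background))
    print(ifn_gamma_positive([0.30, 0.24], background))
```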

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced
from the amino terminus using traditional Edman chemistry. The amino acid
sequence was determined for each polypeptide by comparing the retention time of the
PTH amino acid derivative to the appropriate PTH derivative standards.
Using the procedure described above, antigens having the following
N-terminal sequences were isolated:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-
    Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 54)
(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
    Ser; (SEQ ID No. 55)
(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
    Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 56)
(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
    Pro; (SEQ ID No. 57)
(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
    (SEQ ID No. 58)
(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
    No. 59)
(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-
    Pro-Pro-Ala; (SEQ ID No. 60) and
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
    Gly; (SEQ ID No. 61)

wherein Xaa may be any amino acid.

An additional antigen was isolated employing a microbore HPLC
purification step in addition to the procedure described above. Specifically, 20 µl of a
fraction comprising a mixture of antigens from the chromatographic purification step
previously described, was purified on an Aquapore C18 column (Perkin Elmer/Applied
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions
were eluted from the column with a linear gradient of 1%/minute of acetonitrile
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks
plus other smaller components and a polypeptide was obtained which was shown to
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following N-
terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
    Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-
    Ala-Asp (SEQ ID No. 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above.
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was
performed using anion exchange chromatography on a Poros QE column 4.6 x 100 mm
(Perseptive Biosystems) equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent
was monitored at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula,
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2
ml/min. Eluent was monitored at 214 nm.

The fraction with biological activity was separated into one major peak
plus other smaller components. Western blot of this peak onto PVDF membrane
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These
polypeptides were determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
    Ser; (SEQ ID No. 134)
(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
    Asp; (SEQ ID No. 135) and
(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
    Gly; (SEQ ID No. 136), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the
results of such assays using PBMC preparations from a first and a second donor,
respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and
(g) above were obtained by screening a genomic M. tuberculosis library using 32P end
labeled degenerate oligonucleotides corresponding to the N-terminal sequence and
containing M. tuberculosis codon bias. The screen performed using a probe
corresponding to antigen (a) above identified a clone having the sequence provided in
SEQ ID No. 101. The polypeptide encoded by SEQ ID No. 101 is provided in SEQ ID
No. 102. The screen performed using a probe corresponding to antigen (g) above
identified a clone having the sequence provided in SEQ ID No. 52. The polypeptide
encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. The screen performed
using a probe corresponding to antigen (d) above identified a clone having the sequence
provided in SEQ ID No. 24, and the screen performed with a probe corresponding to
antigen (c) identified a clone having the sequence provided in SEQ ID No: 25.
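
To illustrate why degenerate oligonucleotides are needed when probing from an N-terminal peptide sequence, the sketch below counts how many distinct coding sequences could specify a given peptide under the standard genetic code. It is not the probe design used here: the codon-bias step mentioned above, which narrows this pool, is not modelled, and both the function name and the treatment of Xaa as a fully degenerate NNN triplet (64 possibilities) are assumptions of the sketch.

```python
from math import prod

# Number of codons per residue in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
    "Xaa": 64,  # assumption: unknown residue probed with NNN
}


def probe_degeneracy(peptide: str) -> int:
    """Count the coding sequences for a hyphen-separated peptide.

    Example input: "Tyr-Tyr-Trp-Cys-Pro-Gly" (three-letter codes).
    """
    residues = [r.strip() for r in peptide.split("-") if r.strip()]
    return prod(CODONS_PER_RESIDUE[r] for r in residues)


if __name__ == "__main__":
    # First six residues of antigen (d), SEQ ID No. 57.
    print(probe_degeneracy("Tyr-Tyr-Trp-Cys-Pro-Gly"))  # 128
```

Residues with one or two codons (Trp, Met, Tyr, Cys) keep the pool small, which is one reason a stretch such as the start of antigen (d) is a convenient region for a probe.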

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched
contains some 173,000 proteins and is a combination of the Swiss, PIR databases along
with translated protein sequences (Version 87). No significant homologies to the amino
acid sequences for antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to
a sequence from M. leprae. The full length M. leprae sequence was amplified from
genomic DNA using the sequence obtained from GENBANK. This sequence was then
used to screen the M. tuberculosis library described below in Example 2 and a full
length copy of the M. tuberculosis homologue was obtained (SEQ ID No. 99).

The amino acid sequence for antigen (j) was found to be homologous to
a known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to
a sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in
Table 1:

TABLE 1
Results of PBMC Proliferation and IFN-γ Assays

Sequence    Proliferation    IFN-γ
(a)         +
(c)         +++              +++
(d)         ++               ++
(g)         +++              +++
(h)         +++              +++

In Table 1, responses that gave a stimulation index (SI) of between 2 and
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at
a concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored
as +++. The antigen of sequence (i) was found to have a high SI (+++) for one donor
and lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or
interferon-γ production.
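
The scoring scheme used for Table 1 can be expressed as a small function. The sketch below is an illustration under stated assumptions: the SI is taken as the ratio of the antigen response to the medium-alone response, the boundary values 4 and 8 are assigned to the ++ band, and the function and parameter names are hypothetical.

```python
from typing import Optional


def stimulation_index(response: float, medium_alone: float) -> float:
    """SI: response with antigen relative to medium alone (assumed ratio)."""
    return response / medium_alone


def score_si(si: float, concentration_ug: Optional[float] = None) -> str:
    """Map an SI to the +, ++, +++ scale described for Table 1.

    concentration_ug is the antigen concentration at which the SI was
    obtained; an SI of 2-4 is promoted to ++ when it is 1 µg or less.
    How the exact boundaries (4 and 8) are assigned is an assumption.
    """
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        if concentration_ug is not None and concentration_ug <= 1:
            return "++"
        return "+"
    return ""  # below the range scored in Table 1


if __name__ == "__main__":
    print(score_si(stimulation_index(9500.0, 1000.0)))  # +++
    print(score_si(3.0))                                # +
    print(score_si(3.0, concentration_ug=0.5))          # ++
```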
EXAMPLE 2

Use of Patient Sera to Isolate M. tuberculosis Antigens

This example illustrates the isolation of antigens from M. tuberculosis
lysate by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a
2% NP40 solution, and alternately homogenized and sonicated three times. The
resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the
supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro
Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with
20 mM Tris pH 7.5 and bound proteins eluted with 1 M NaCl. The 1 M NaCl eluate was
dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with
DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with α-D-
mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to
pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column
(BioRad). Fractions were combined into nine pools, concentrated in a Centriprep 10
(Amicon, Beverly, MA) and then screened by Western blot for serological activity
using a serum pool from M. tuberculosis-infected patients which was not
immunoreactive with other antigens of the present invention.
The most reactive fraction was run in SDS-PAGE and transferred to
PVDF. A band at approximately 85 Kd was cut out yielding the sequence:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
    Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may
    be any amino acid.

Comparison of this sequence with those in the gene bank as described
above revealed no significant homologies to known sequences.

A DNA sequence that encodes the antigen designated as (m) above was
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID
NO: 137. A clone was identified having the DNA sequence provided in SEQ ID NO:
203. This sequence was found to encode the amino acid sequence provided in SEQ ID
NO: 204. Comparison of these sequences with those in the gene bank revealed some
similarity to sequences previously identified in M. tuberculosis and M. bovis.

EXAMPLE 3

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening an M. tuberculosis expression library with sera
obtained from patients infected with M. tuberculosis, or with anti-sera raised against
soluble M. tuberculosis antigens.

A. Preparation of M. tuberculosis Soluble Antigens Using Rabbit Anti-sera
Raised Against M. tuberculosis Supernatant

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The
DNA was randomly sheared and used to construct an expression library using the
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera were
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of
protein antigen in a total volume of 2 ml containing 10 µg muramyl dipeptide
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg
protein antigen. The anti-sera were used to screen the expression library as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing
immunoreactive antigens were purified. Phagemid from the plaques was rescued and
the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that
have not been previously identified in human M. tuberculosis. Recombinant antigens
were expressed and purified antigens used in the immunological analysis described in
Example 1. Proteins were induced by IPTG and purified by gel elution, as described in
Skeiky et al., J. Exp. Med. 181:1527-1537, 1995. Representative sequences of DNA
molecules identified in this screen are provided in SEQ ID Nos.: 1-25. The
corresponding predicted amino acid sequences are shown in SEQ ID Nos. 63-87.

On comparison of these sequences with known sequences in the gene
bank using the databases described above, it was found that the clones referred to
hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID Nos. 76, 68, 70, 75)
show some homology to sequences previously identified in Mycobacterium leprae but
not in M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID Nos.: 65,
73, 74, 53) have been previously identified in M. tuberculosis. No significant
homologies were found to TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13,
TbRA17, TbRA19, TbRA29, TbRA32, TbRA36 and the overlapping clones TbRA35
and TbRA12 (SEQ ID Nos. 63, 77, 81, 82, 64, 67, 69, 71, 75, 78, 80, 79, 66). The
clone TbRa24 is overlapping with clone TbRa29.

The results of PBMC proliferation and interferon-γ assays performed on
representative recombinant antigens, and using T-cell preparations from several
different M. tuberculosis-immune patients, are presented in Tables 2 and 3,
respectively.
In Tables 2 and 3, responses that gave a stimulation index (SI) of
between 1.2 and 2 (compared to cells cultured in medium alone) were scored as ±, an SI
of 2-4 was scored as +, an SI of 4-8 or 2-4 at a concentration of 1 µg or less was scored
as ++ and an SI of greater than 8 was scored as +++. In addition, the effect of
concentration on proliferation and interferon-γ production is shown for two of the above
antigens in the attached Figure. For both proliferation and interferon-γ production,
TbRa3 was scored as ++ and TbRa9 as +.

These results indicate that these soluble antigens can induce proliferation
and/or interferon-γ production in T-cells derived from an M. tuberculosis-immune
individual.

B. Use of Sera From Patients Having Pulmonary or Pleural Tuberculosis
to Identify DNA Sequences Encoding M. tuberculosis Antigens

The genomic DNA library described above, and an additional H37Rv
library, were screened using pools of sera obtained from patients with active
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic
DNA was isolated, subjected to partial Sau3A digestion and used to construct an
expression library using the Lambda Zap expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with
active pulmonary or pleural disease, were used in the expression screening. The pools
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both
ELISA and immunoblot format. A fourth pool of sera from seven patients with active
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor,
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the
M. tuberculosis clones deduced.
Thirty two clones were purified. Of these, 31 represented sequences that
had not been previously identified in human M. tuberculosis. Representative sequences
of the DNA molecules identified are provided in SEQ ID Nos.: 26-51 and 105. Of
these, TbH-8-2 (SEQ. ID NO. 105) is a partial clone of TbH-8, and TbH-4 (SEQ. ID
NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the
same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1,
TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID Nos.: 88-92. Comparison of
these sequences with known sequences in the gene bank using the databases identified
above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and TbM-3,
although weak homologies were found to TbH-9. TbH-12 was found to be homologous
to a 34 kD antigenic protein previously identified in M. paratuberculosis (Acc.
No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open
reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc.
No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717,
1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS. 112, 113, 116, 118, and
119). (SEQ ID NOS. 112 and 113 are non-contiguous sequences from clone Tb38-
1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL
(SEQ. ID. NO. 114), the second, a partial sequence, may be the homologue of Tb38-1
and is called Tb38-IN (SEQ. ID NO. 115). The deduced amino acid sequence of Tb38-
1F3 is presented in SEQ. ID. NO. 117. A TbH-9 probe identified three clones in the
H37Rv library: TbH-9-FL (SEQ. ID NO. 106), which may be the homologue of TbH-9
(H37Ra), TbH-9-1 (SEQ. ID NO. 108), and TbH-9-4 (SEQ. ID NO. 110), all of which
are highly related sequences to TbH-9. The deduced amino acid sequences for these
three clones are presented in SEQ ID NOS. 107, 109 and 111.

Further screening of the M. tuberculosis genomic DNA library, as
described above, resulted in the recovery of ten additional reactive clones, representing
seven different genes. One of these genes was identified as the 38 Kd antigen discussed
above, one was determined to be identical to the 14 Kd alpha crystallin heat shock
protein previously shown to be present in M. tuberculosis, and a third was determined
to be identical to the antigen TbH-8 described above. The determined DNA sequences
for the remaining five clones (hereinafter referred to as TbH-29, TbH-30, TbH-32 and
TbH-33) are provided in SEQ ID NO: 138-141, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 142-145, respectively.
The DNA and amino acid sequences for these antigens were compared with those in the
gene bank as described above. No homologies were found to the 5' end of TbH-29
(which contains the reactive open reading frame), although the 3' end of TbH-29 was
found to be identical to the M. tuberculosis cosmid Y227. TbH-32 and TbH-33 were
found to be identical to the previously identified M. tuberculosis insertion element
IS6110 and to the M. tuberculosis cosmid Y50, respectively. No significant homologies
to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E.
coli XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant
protein was accomplished by the addition of IPTG. Induced and uninduced lysates
were run in duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters
were reacted with human M. tuberculosis sera (1:200 dilution) reactive with TbH and
rabbit sera (1:200 or 1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ.
Sera incubations were performed for 2 hours at room temperature. Bound antibody was
detected by addition of 125I-labeled Protein A and subsequent exposure to film for
variable times ranging from 16 hours to 11 days. The results of the immunoblots are
summarized in Table 4.

TABLE 4

Antigen     Human M. tb Sera    Anti-lacZ Sera

TbH-29      45 Kd               45 Kd
TbH-30      No reactivity       29 Kd
TbH-32      12 Kd               12 Kd
TbH-33      16 Kd               16 Kd

Positive reaction of the recombinant human M. tuberculosis antigens
with both the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of
the human M. tuberculosis sera is directed towards the fusion protein. Antigens
reactive with the anti-lacZ sera but not with the human M. tuberculosis sera may be the
result of the human M. tuberculosis sera recognizing conformational epitopes, or the
antigen-antibody binding kinetics may be such that the 2 hour sera exposure in the
immunoblot is not sufficient.



The results of T-cell assays performed on Tb38-1, ESAT-6 and other
representative recombinant antigens are presented in Tables 5A, 5B and 6, respectively,
below:



TABLE 5A

Results of PBMC Proliferation to Representative Antigens

Antigen     Donor 1 through Donor 11
Tb38.1      (individual donor scores are not legible)

TABLE 5B

Results of PBMC Interferon-γ Production to Representative Antigens

Antigen     Donor 1 through Donor 11
Tb38.1      (individual donor scores are not legible)
TbH-9       (individual donor scores are not legible)

TABLE 6

Summary of T-cell Responses to Representative Antigens

Antigen     Proliferation                          Interferon-γ                           total
            patient 4    patient 5    patient 6    patient 4    patient 5    patient 6

(Individual patient scores are not legible; the legible totals are 13, 7.5 and, for TbH4, 0.)



These results indicate that both the inventive M. tuberculosis antigens
and ESAT-6 can induce proliferation and/or interferon-γ production in T-cells derived
from an M. tuberculosis-immune individual. To the best of the inventors' knowledge,
ESAT-6 has not been previously shown to stimulate human immune responses.

A set of six overlapping peptides covering the amino acid sequence of
the antigen Tb38-1 was constructed using the method described in Example 6. The
sequences of these peptides, hereinafter referred to as pep 1-6, are provided in SEQ ID
Nos. 93-98, respectively. The results of T-cell assays using these peptides are shown in
Tables 7 and 8. These results confirm the existence, and help to localize, T-cell epitopes
within Tb38-1 capable of inducing proliferation and interferon-γ production in T-cells
derived from an M. tuberculosis-immune individual.
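
The idea of covering an antigen with a fixed number of overlapping peptides can be sketched in Python. This is an illustration only: the peptide length and overlap used for pep 1-6 are not stated in this passage, so the `overlap` value, the function name and the toy sequence below are assumptions, and the actual peptides are those of SEQ ID Nos. 93-98.

```python
import math


def overlapping_peptides(sequence: str, count: int, overlap: int) -> list[str]:
    """Split `sequence` into `count` equal-length peptides that tile it end to end,
    with neighbouring peptides sharing `overlap` residues where the lengths allow."""
    if count < 1 or overlap < 0:
        raise ValueError("count must be >= 1 and overlap must be >= 0")
    n = len(sequence)
    # count * length - (count - 1) * overlap must reach the end of the sequence.
    length = math.ceil((n + (count - 1) * overlap) / count)
    if length <= overlap or length > n:
        raise ValueError("overlap is too large for this sequence and peptide count")
    step = length - overlap
    peptides = []
    for i in range(count):
        # Clamp the start so the last peptide ends exactly at the C-terminus.
        start = min(i * step, n - length)
        peptides.append(sequence[start:start + length])
    return peptides


# Hypothetical 100-residue sequence (NOT Tb38-1), used only to exercise the function.
toy_sequence = "ACDEFGHIKLMNPQRSTVWY" * 5
peptides = overlapping_peptides(toy_sequence, count=6, overlap=10)
for number, peptide in enumerate(peptides, start=1):
    print(f"pep {number}: {peptide}")
```

With these assumed settings each peptide is 25 residues long and shares 10 residues with its neighbour, so a T-cell epitope that straddles a peptide boundary is still presented intact by at least one peptide when it is no longer than the overlap.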

Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially the same as that described in Example
3A. Total M. tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and
the recombinant antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels,
immobilized on nitrocellulose membranes and duplicate blots were probed using the rabbit
sera described above.

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 3A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 µg of M. tuberculosis lysate; 3) 5 µg secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 3D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis.
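
The "approximately 1 kD" shift attributed to the six-histidine tag can be checked with a short calculation. This is a rough sketch: 137.14 Da is the standard average mass of a histidine residue within a peptide chain, and the function name is illustrative. Any linker residues or an initiating methionine in the actual constructs are not considered.

```python
HIS_RESIDUE_DA = 137.14  # average mass of one histidine residue in a peptide chain


def his_tag_mass_da(n_his: int = 6) -> float:
    """Mass added to a protein by a run of `n_his` histidine residues, in daltons."""
    return n_his * HIS_RESIDUE_DA


print(f"Six histidines add about {his_tag_mass_da() / 1000:.2f} kD")
```

The result, a little over 0.8 kD, is in line with the roughly 1 kD mobility difference expected on a gel.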

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9,
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control M.
tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 4A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 4B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC from a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. Use of Sera From Patients having Extrapulmonary Tuberculosis to Identify
DNA Sequences Encoding M. tuberculosis Antigens

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID Nos.: 156-158,
respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
Nos.: 159 and 160, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID No: 161. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID No.: 162.

Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
genebank as described above, revealed no homologies with the exception of the 3' ends of
XP2 and XP6 which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID Nos.: 163 and 164,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
Nos: 165-168, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID Nos: 169 and 170; 171 and 172; 173 and 174; 175
and 176; 177 and 178; 179 and 180; and 181 and 182, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID No.: 183. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID No: 184. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 185. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequences shown in SEQ ID Nos.:
186 and 187, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 188.
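
Searching both a DNA sequence and its reverse complement for open reading frames, as described for TbH4-XP1 and XP36, can be sketched as follows. This is a simplified illustration and not the analysis actually used: it assumes ATG as the only start codon (M. tuberculosis also uses alternative starts such as GTG), the function names are illustrative, and the short input sequences are hypothetical rather than taken from the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (N is kept as N)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def open_reading_frames(dna: str, min_codons: int = 2) -> list[str]:
    """Return ATG...stop stretches found in the three frames of the given strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append(dna[start:i + 3])
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical fragment whose only ORF lies on the reverse strand.
fragment = "TTAGGCCAT"
print(open_reading_frames(fragment))                      # forward strand
print(open_reading_frames(reverse_complement(fragment)))  # reverse strand
```

Scanning the reverse complement separately mirrors the text, where distinct amino acid sequences are reported for the forward and reverse strands of the same clone.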

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification. As
illustrated in Figures 8A-B and 9A-B, using the assays described herein, recombinant XP1
was found to stimulate cell proliferation and IFN-γ production in T cells isolated from
M. tuberculosis-immune donors.

D. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised Against M. tuberculosis Fractionated Proteins

M. tuberculosis lysate was prepared as described above in Example 2. The 
resulting material was fractionated by HPLC and the fractions screened by Western blot for 
serological activity with a serum pool from M. tuberculosis-infected patients which showed 
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera 
was generated against the most reactive fraction using the method described in Example 3A.
The anti-sera was used to screen an M. tuberculosis Erdman strain genomic DNA expression 
library prepared as described above. Bacteriophage plaques expressing immunoreactive 
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences 
of the M. tuberculosis clones determined. 

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen,
HSP60. Of the remaining eight clones, seven (hereinafter referred to as RDIF2, RDIF5,
RDIF8, RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously
identified M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5,
RDIF8, RDIF10 and RDIF11 are provided in SEQ ID Nos.: 189-193, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID Nos: 194-198,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID Nos.: 199
and 200, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
Nos.: 201 and 202, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. As shown in Figures 8A-B and 9A-B, these antigens were found to
stimulate cell proliferation and IFN-γ production in T cells isolated from
M. tuberculosis-immune donors.

EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al., 
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for 
standard. The American Review of Tuberculosis 44:9-25, 1941).

M. tuberculosis Rv strain was grown for 6 weeks in synthetic medium in roller
bottles at 37°C. Bottles containing the bacterial growth were then heated to 100°C in water
vapor for 3 hours. Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was
concentrated 20 times using a 3 kD cut-off membrane. Proteins were precipitated once with
50% ammonium sulfate solution and eight times with 25% ammonium sulfate solution. The
resulting proteins (PPD) were fractionated by reverse phase liquid chromatography (RP-HPLC)
using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad HPLC system
(Perseptive Biosystems, Framingham, MA). Fractions were eluted from the column with a
linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The flow rate was 10
ml/minute and eluent was monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
No.: 129. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID Nos.: 130-133.

The ability of the antigen DPPD to stimulate human PBMC to proliferate and
to produce IFN-γ was assayed as described in Example 1. As shown in Table 9, DPPD was
found to stimulate proliferation and elicit production of large quantities of IFN-γ, more than
that elicited by commercial PPD.

TABLE 9

Results of Proliferation and Interferon-γ Assays to DPPD

PBMC Donor   Stimulator          Proliferation (CPM)   IFN-γ (OD450)

A            Medium              1,089                 0.17
             PPD (commercial)    8,394                 1.29
             DPPD                13,451                2.21

B            Medium              450                   0.09
             PPD (commercial)    3,929                 1.26
             DPPD                6,184                 1.49

C            Medium              541                   0.11
             PPD (commercial)    8,907                 0.76
             DPPD                23,024                >2.70
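
The proliferation data of Table 9 can be compared across donors by expressing each response as a fold increase over the medium-only control. The sketch below is illustrative: the term "stimulation index" and the function name are assumptions made here, not something defined in this example, and the CPM values are copied from Table 9.

```python
# Proliferation (CPM) from Table 9, keyed by donor and then by stimulator.
TABLE_9_CPM = {
    "A": {"Medium": 1089, "PPD (commercial)": 8394, "DPPD": 13451},
    "B": {"Medium": 450, "PPD (commercial)": 3929, "DPPD": 6184},
    "C": {"Medium": 541, "PPD (commercial)": 8907, "DPPD": 23024},
}


def stimulation_index(donor: str, stimulator: str) -> float:
    """CPM with the stimulator divided by CPM with medium alone (assumed definition)."""
    counts = TABLE_9_CPM[donor]
    return counts[stimulator] / counts["Medium"]


for donor in TABLE_9_CPM:
    ppd = stimulation_index(donor, "PPD (commercial)")
    dppd = stimulation_index(donor, "DPPD")
    print(f"Donor {donor}: PPD {ppd:.1f}-fold, DPPD {dppd:.1f}-fold over medium")
```

Normalizing to the medium control makes the three donors comparable despite their different background counts; in each case DPPD gives the larger fold increase, consistent with the text.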

EXAMPLE 5

Use of Representative Antigens for Diagnosis of Tuberculosis

This example illustrates the effectiveness of several representative
polypeptides in skin tests for the diagnosis of M. tuberculosis infection.

Individuals were injected intradermally with 100 µl of either PBS or PBS plus
Tween 20™ containing either 0.1 µg of protein (for TbH-9 and TbRa35) or 1.0 µg of protein
(for TbRa38-1). Induration was measured between 5-7 days after injection, with a response
of 5 mm or greater being considered positive. Of the 20 individuals tested, 2 were PPD
negative and 18 were PPD positive. Of the PPD positive individuals, 3 had active
tuberculosis, 3 had been previously infected with tuberculosis and 9 were healthy. In a
second study, 13 PPD positive individuals were tested with 0.1 µg TbRa11 in either PBS or
PBS plus Tween 20™ as described above. The results of both studies are shown in Table 10.

TABLE 10

Results of DTH Testing with Representative Antigens

                 TbH-9       Tb38-1      TbRa35      Cumulative  TbRa11
                 Pos/Total   Pos/Total   Pos/Total   Pos/Total   Pos/Total

PPD negative     0/2         0/2         0/2         0/2

PPD positive
  healthy        5/9         4/9         4/9         6/9         1/4
  prior TB       3/5         2/5         2/5         4/5         3/5
  active         3/4         3/4         0/4         4/4         1/4

TOTAL            11/18       9/18        6/18        14/18       5/13
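
The TOTAL row of Table 10 is the sum of the three PPD-positive subgroups, which can be verified with a few lines of Python. This is a bookkeeping sketch only; the dictionary layout and function name are illustrative, and the counts are copied from the table.

```python
# (positive, tested) counts for the PPD-positive subgroups of Table 10.
TABLE_10 = {
    "TbH-9":      {"healthy": (5, 9), "prior TB": (3, 5), "active": (3, 4)},
    "Tb38-1":     {"healthy": (4, 9), "prior TB": (2, 5), "active": (3, 4)},
    "TbRa35":     {"healthy": (4, 9), "prior TB": (2, 5), "active": (0, 4)},
    "Cumulative": {"healthy": (6, 9), "prior TB": (4, 5), "active": (4, 4)},
    "TbRa11":     {"healthy": (1, 4), "prior TB": (3, 5), "active": (1, 4)},
}


def column_total(antigen: str) -> tuple[int, int]:
    """Sum positives and totals over the PPD-positive subgroups for one antigen."""
    groups = TABLE_10[antigen].values()
    return sum(pos for pos, _ in groups), sum(total for _, total in groups)


for antigen in TABLE_10:
    positive, tested = column_total(antigen)
    print(f"{antigen}: {positive}/{tested} = {100 * positive / tested:.0f}% positive")
```

The computed totals reproduce the TOTAL row (11/18, 9/18, 6/18, 14/18 and 5/13), which confirms that the subgroup sizes in the table are 9 healthy, 5 prior TB and 4 active for the first study.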



EXAMPLE 6

Synthesis of Synthetic Polypeptides

Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HPTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0%-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.
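
When a synthetic peptide is characterized by mass spectrometry, the measured mass is compared with the mass expected from its sequence. The sketch below computes an average molecular mass from standard average residue masses, with and without the amino-terminal Gly-Cys-Gly extension. It is an illustration under stated assumptions: the function names are made up for this example, the peptide shown is hypothetical, and free termini with no modifications are assumed.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per peptide accounts for the free N- and C-termini


def average_mass(peptide: str) -> float:
    """Average molecular mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide.upper()) + WATER


def with_gcg_tag(peptide: str) -> str:
    """Prepend the Gly-Cys-Gly sequence used for conjugation or labeling."""
    return "GCG" + peptide


example = "ACDEFGHIK"  # hypothetical peptide, not one of the listed sequences
print(f"{example}: {average_mass(example):.2f} Da")
print(f"{with_gcg_tag(example)}: {average_mass(with_gcg_tag(example)):.2f} Da")
```

The tagged peptide is heavier by the three added residues (about 217.24 Da), which is the difference one would look for in the spectrum to confirm that the extension is present.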

EXAMPLE 7

Preparation and Characterization of M. tuberculosis Fusion Proteins

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein
TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 146 and 147), PDM-57 and PDM-58 (SEQ ID NO: 148
and 149), and PDM-69 and PDM-60 (SEQ ID NO: 150 and 151), respectively. In each case,
the DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl
each of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
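
The reaction mixture described above can be checked arithmetically: the listed volumes add up to a 100 µl reaction, from which the final concentration of each component follows by simple dilution. The sketch below is illustrative; the labels "forward primer" and "reverse primer" and the function name are assumptions, since the text only says that 2 µl of each primer was used.

```python
# Component volumes (µl) for one amplification reaction, as listed in the text.
REACTION_UL = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "forward primer (10 µM)": 2.0,  # label assumed for illustration
    "reverse primer (10 µM)": 2.0,  # label assumed for illustration
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_ul / total_ul


total_ul = sum(REACTION_UL.values())
print(f"Total reaction volume: {total_ul} µl")
print(f"Buffer: {final_concentration(10, 10.0, total_ul)}X")
print(f"dNTPs: {final_concentration(10, 2.0, total_ul)} mM")
print(f"Each primer: {final_concentration(10, 2.0, total_ul)} µM")
print(f"TbRa3 template: {70 * REACTION_UL['template DNA']} ng per reaction")
```

Under these figures the buffer ends up at 1X, the dNTPs at 0.2 mM and each primer at 0.2 µM, with 70 ng (TbRa3) or 50 ng (38 kD and Tb38-1) of template per reaction.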

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7^L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7^L2Ra3-1 vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7^L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was
confirmed by DNA sequencing.
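
Before a cloning scheme like the one above is carried out, each fragment is normally checked for the positions of the restriction sites involved. The sketch below scans a sequence for the recognition sequences of the enzymes named in the text. It is an illustration only: the function name is made up, and the test sequence is hypothetical (an NdeI site followed by six histidine codons, as at the start of SEQ ID NO: 3, with an EcoRI site appended here purely for demonstration).

```python
# Recognition sequences (5'->3') of the restriction enzymes named in the text.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "EcoRI": "GAATTC",
    "StuI": "AGGCCT",
    "Eco47III": "AGCGCT",
    "Sse8387I": "CCTGCAGG",
}


def find_sites(dna: str) -> dict[str, list[int]]:
    """Map each enzyme to the 0-based positions where its site starts in `dna`."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        index = dna.find(site)
        while index != -1:
            positions.append(index)
            index = dna.find(site, index + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


# Hypothetical insert: NdeI site, six His codons, then an EcoRI site.
insert = "CATATG" + "CATCAC" * 3 + "GAATTC"
for enzyme, positions in find_sites(insert).items():
    print(f"{enzyme}: {positions}")
```

All the recognition sequences listed are palindromic, so scanning one strand is sufficient. An enzyme intended to cut only at the ends of a fragment should report no internal positions.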

The expression construct was transformed into BLR pLys S E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml Leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 X g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 152 and 153,
respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 156.

The ability of the fusion protein TbH9-Tb38-1 to induce T cell proliferation
and IFN-γ production in PBMC preparations was examined using the protocol described
above in Example 1. PBMC from three donors were employed: one who had been previously
shown to respond to TbH9 but not Tb38-1 (donor 131); one who had been shown to respond
to Tb38-1 but not TbH9 (donor 184); and one who had been shown to respond to both
antigens (donor 201). The results of these studies (Figs. 5-7, respectively) demonstrate the
functional activity of both the antigens in the fusion protein.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 150) and PDM-83 (SEQ ID NO: 205) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-85
(SEQ ID NO: 206 and 207, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94°C
was performed for 2 min, followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and
72°C for 1.5 min; 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min; and
finally by 72°C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
208 and 209, respectively.

The reactivity of the fusion protein TbF-2 with sera from M. tuberculosis-infected
patients was examined by ELISA using the protocol described above. The results of
these studies (Table 11) demonstrate that all four antigens function independently in the
fusion protein.

TABLE 11

Reactivity of TbF-2 Fusion Recombinant with TB and Normal Sera

Serum ID  Status  TbF    Status  TbF-2  Status  ELISA Reactivity
                  OD450          OD450          38 kD  TbRa3  Tb38-1

B931-40   TB      0.57   +       0.321                                +
B931-41   TB      0.601  +       0.396  +       +      +      +
B931-109  TB      0.494  +       0.404  +              +
B931-132  TB      1.502  +       1.292  +              +              +
5004      TB      1.806  +       1.666  +       +      ±      +
15004     TB      2.862  +       2.468  +       +      +      +
39004     TB      2.443          1.722          +             +
68004     TB      2.871  +       2.575  +       +             +
99004     TB      0.691  +       0.971  +       -      +      +       -
107004    TB      0.875  +       0.732  +       -      ±
92004     TB      1.632  +       1.394  +              ±
97004     TB      1.491  +       1.979  +              ±
118004    TB      3.182          3.045          +      +
173004    TB      3.644  +       3.578  +       +      +
175004    TB      3.332          2.916  +       +      +
274004    TB      3.696  +       3.716          -      +
276004    TB      3.243  +       2.56   +       -      -
282004    TB      1.249  +       1.234  +       +      -
289004    TB      1.373  +       1.17   +       -      +
308004    TB      3.708  +       3.355  +       -      -
314004    TB      1.663  +       1.399  +       -             +
317004    TB      1.163  +       0.92   +       +      -
312004    TB      1.709  +       1.453  +       -      +
380004    TB      0.238          0.461  +       -      ±
451004    TB      0.18           0.2            -      -              +
478004    TB      0.188          0.469  +       -      -              +
410004    TB      0.384  +       2.392  +       ±                     +
411004    TB      0.306  +       0.874  +
421004    TB      0.357  +       1.456  +
(one row with illegible serum ID, status and TbF OD450: TbF-2 OD450 0.196; + in the last column)
A6-87     Normal  0.094          0.063
A6-88     Normal  0.214          0.19
A6-89     Normal  0.248          0.125
A6-90     Normal  0.179          0.206
A6-91     Normal  0.135          0.151
A6-92     Normal  0.064          0.097
A6-93     Normal  0.072          0.098
A6-94     Normal  0.072          0.064
A6-95     Normal  0.125          0.159
A6-96     Normal  0.121          0.12

Cut-off           0.284          0.266
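
The status columns of Table 11 follow from comparing each OD450 reading with the cut-off given for that antigen. The sketch below applies that comparison to a few rows copied from the table. It is illustrative only: the rule "reactive when OD450 exceeds the cut-off" and the function names are assumptions, and how the cut-off values themselves were derived is not stated here.

```python
CUTOFF_OD450 = {"TbF": 0.284, "TbF-2": 0.266}  # cut-off row of Table 11

# (serum ID, status, TbF OD450, TbF-2 OD450): a few rows copied from Table 11.
ROWS = [
    ("B931-40", "TB", 0.57, 0.321),
    ("380004", "TB", 0.238, 0.461),
    ("451004", "TB", 0.18, 0.2),
    ("478004", "TB", 0.188, 0.469),
    ("A6-87", "Normal", 0.094, 0.063),
    ("A6-89", "Normal", 0.248, 0.125),
]


def is_reactive(od450: float, antigen: str) -> bool:
    """Assumed rule: a serum is scored + when its OD450 exceeds the cut-off."""
    return od450 > CUTOFF_OD450[antigen]


def reactive_ids(antigen: str) -> list[str]:
    """Serum IDs among ROWS that are reactive with the given fusion protein."""
    column = 2 if antigen == "TbF" else 3
    return [row[0] for row in ROWS if is_reactive(row[column], antigen)]


print("TbF reactive:  ", reactive_ids("TbF"))
print("TbF-2 reactive:", reactive_ids("TbF-2"))
```

In this subset, sera 380004 and 478004 fall below the TbF cut-off but above the TbF-2 cut-off, the kind of gain in reactivity that an additional antigen in the fusion would be expected to provide, while both normal sera stay negative.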

One of skill in the art will appreciate that the order of the individual antigens
within the fusion protein may be changed and that comparable activity would be expected
provided each of the epitopes is still functionally available. In addition, truncated forms of
the proteins containing active epitopes may be used in the construction of fusion proteins.

From the foregoing, it will be appreciated that, although specific embodiments
of the invention have been described herein for the purpose of illustration, various
modifications may be made without deviating from the spirit and scope of the invention.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANTS: Reed, Steven G. 

Skeiky, Yasir A.W. 
Dillon, Davin C. 
Campos-Neto, Antonio 
Houghton, Raymond 
Vedvick, Thomas S. 
Twardzik, Daniel R. 
Lodes, Michael J. 

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR IMMUNOTHERAPY 
AND DIAGNOSIS OF TUBERCULOSIS 

(iii) NUMBER OF SEQUENCES: 214 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP : 98104-7092 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 01-OCT-1997 

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Maki, David J.

(B) REGISTRATION NUMBER: 31,392

(C) REFERENCE/DOCKET NUMBER: 210121.411C7

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 766 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60

ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120

GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 180

GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 240

GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 300

GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 360

AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 420

GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 480

GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 540

ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 600

GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 660

GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 720

GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA 766



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 752 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 60

GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 120

GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 180

TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 240

TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 300

TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360

TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 420

ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 480

GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 540

CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 600

CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GGTTTCACCC CNTACCNACT GCCNCCCGGN 660

TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 720

TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 752

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 60
CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 120
CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 180
GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 240
ATGCTGGTCA CGGCTGTCGT TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 300
ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 360
GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGACCA GAAGTCGCTG 420
GAAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG CGGCCACATC GTCCACTCCA 480
CGCGAAGCCC CCTACGAATT GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 540
CGTGGTACGC AGGCCGTGGT GCTCAMGGTC TACCACAACG CCGGCGGCAC GCACCCAACG 600
ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTATGACACG 660
CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTCTTCC CCATTGTTGC AAGGTGAACT 720
GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 780
TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 813

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60
[illegible] GGGCAGGCGA TGGCGATCGC [illegible] 120
ATCGGGCCTA CCGCCTTCCT [illegible] CGGPTTGGGT GTTGTPGACA [illegible] 180
CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240
CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300
CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360
CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420
ATACCACCCG CCGGCCGGCC AATTGGA 447



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60
CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180
CCGGCGACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240
ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300
TGAATGATGG AAAACGGGNG GNGACGTCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360
NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420
NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG NTTGATGNGA 480
NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540
NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600
NAAT 604



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 60
CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 120
TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA 180
CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 240
CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT CGACGTCGTC 300
ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG 360
GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 420
CGCGATCAAC GGCGCCGCGG TCACCGGCGG GCTCGAACTG GCGCTGTACT GCGACATCCT 480
GATCGCCTCC GAGCACGCCC GCTTCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 540
CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG 600
CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 633

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC GACGCGGCCC TGGCCAGAGT 60
CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATATTGA GCCCGTCGCG 120
CCCCGCCGAG CCGGCGGCGC GGTCGCCGAG GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 180
CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG GACTGCTCAC CGCCGGCTGG 240
GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 300
GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TCGACGCACA CACCACCATG 360
CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420
GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG GAACCGGGAC ACCGGCGGGA 480
CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC 540
CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 600
CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 660
GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 720
GCATGGGCAA CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780
GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAGGCGGGT CGTGGGGTCG 840
TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACGAGCACAC CGCCGAGCTG 900
CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960
GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCCTGCTCG ACACCGATGC GGCGCTGGTT 1020
GGCGCCCTGG CCTGGGCCGC CTTCACCGCC GCGCGGCGCA TCGGCACCTG GATCGGCGCC 1080
GCCGCCGAGG GCCAGGTGTC GCGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140
TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG CGACACAGGT 1200
GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTCGGT TGGCCGCACT 1260
GGCCGATCAG GTCGGCGCCG GCCCTTGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320
GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG AA 1362

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1458 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 60
GTATCGCTCC CGTTGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCCGGTC 120
TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 180
CTAAGGCCTT GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 240
TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 300
TGATGGACCG ATCGGCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 360
CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 420
CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC 480
CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC 540
GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600
CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGTTTCT ACGGCTGTAT GACAGTGCCG 660
CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 720
CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780
ATTTCAACCT ATCGGTTGGT GTGACCGACG CGTTCCTGCG GGCCGTCGAA CGCAACGGCC 840
TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900
TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGCGA TCCCGGGCTG GTGTTTCTCG 960
ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 1020
GCGGGGAGGT CCCACTGCTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 1080
GGATGCTCGC CGACGGTCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 1140
TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 1200
CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGCTTGCCG 1260
CACTGGGTAT TCCGTACGAC AGTGAAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320
GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 1380
CGGCGTTCAC CGATAGCCGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440
CCGTCGCTCC GACGGGCA 1458



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC 60
GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG 120
TCATCGCCTT CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180
CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG 240
AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG 300
CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGG 360
CCGTCAAAGG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA 420
TGGTGGTCAC CAACATCGGC CTGGTGTCCT GTAAACGCGA CGTTGGGGCC GCGGTGTTGG 480
CCGCCTACGT TTACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT 540
CGAATGAGAC GCTGGTCAAG ACGTTTTCCC CCGGTGAGCA GGTAACGACC GCGGTGACCT 600
GGACCGGGAT GGGATCGGCG CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA 660
CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA 720
ATCAGCCGCC GCCGCCGCCC GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC 780
CGGAGTCTCC CGCGCAAGGC GGATAATTAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT 840
GACAACCCCT CGCCTCGTGC CG 862

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA 60
GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA 120
GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG 180
TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CTTTCAGGAT 240
CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG 300
TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT 360
TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG 420
ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGGGTTTGTT GTCGACGGCG 480
TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCTGAACACA GCGGTGATCG 540
GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC GATGTCACCC 600
CGGAAGCCAC CCGNGACATT CT 622



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGCGGTGGC 60
ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG 120
AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCATGGAGCA GTTCGTCTAT 180
GCCTACGTGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG GTCCGGTGCC 240
GGGGTGACCC AGTTTCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT CCCGTTGAAT 300
CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 360
CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG CACGCTGAAT 420
CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG GAATGATCCA 480
CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG CGTTATCTTC 540
CGCAGCGACA AGTCCGGTAC GTCGGACAAC TTCCAGAAAT ACCTCGACGG TGTATCCAAC 600
GGGGCGTGGG GCAAAGGCGC CAGCGAAACG TTCAGCGGGG GCGTCGGCGT CGGCGCCAGC 660
GGGAACAACG GAACGTCGGC CCTACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 720
TGGTCGTTTG CGGTGGGTAA GCAGTTGAAC ATGGCCCAGA TCATCACGTC GGCGGGTCCG 780
GATCCAGTGG CGATCACCAC CGAGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG 840
GGACAAGGCA ACGACCTGGT ATTGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 900
TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG 960
ACCGGTACTG CGGTAAGGGC GTTTATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 1020
GACCAATACG GCTCCATTCC GTTGCCCAAA TCGTTCCAAG CAAAATTGGC GGCCGCGGTG 1080
AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGTGA GCGATGCCGT TCCGCAGGTA 1140
GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GATGGGCGAG 1200



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GCAAGCAGCT GCAGGTCGTG CTGTTCGACG AACTGGGCAT GCCGAAGACC AAACGCACCA 60
AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC 120
CGTTTCTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180
GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240
CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG 300
ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360
CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420
TCATCGAGGC GTTCAACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTCG 480
GTGTGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCCTACG 540
GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGAAAATC TCCACCGAGG 600
AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC 660
GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720
GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 780
CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840
TCGACAAGGC GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900
AGCTGCTGTT CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960
AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA 1020
GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA TTCGGCGATT 1080
TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TTTGTCCAGC GTGTACCCGT 1140
CGAGTAGCCT CGTCA 1155



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1771 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GAGCGCCGTC TGGTGTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT 60
TCGGGCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120
ATCGAAGACA CCGACCCAGG CGGCCAGCCC CCTGGAACGT CGATTTACGT GCTGCTCCCC 180
GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240
ATCGAGAACT CTCGGGGTTC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300
ACCTAGTTGT GCAGTTACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360
GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420
GTATTCGCCA CCGCCGCAGC AGCCGGGAAC CCCAGGTTAT GCTCAGGGGC AGCAGCAAAC 480
GTACAGCCAG CAGTTCGACT GGCGTTACCC ACCGTCCCCG CCCCCGCAGC CAACCCAGTA 540
CCGTCAACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600
GACCATGACG CCCCCTCCTG GGATGGTTCG CCAACGCCCT CGTGCAGGCA TGTTGGCCAT 660
CGGCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC GGCGCGGCCG CATCCCTGGT 720
CGGGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA GCGCGGCGCC 780
AAGCATCCCC GCAGCAAACA TGCCGCCGGG GTCGGTCGAA CAGGTGGCGG CCAAGGTGGT 840
GCCCAGTGTC GTCATGTTGG AAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900
CATTCTGTCT GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960
GCCTCCCCTG GGCAGTCCGC CGCCGAAAAC GACGGTAACC TTCTCTGACG GGCGGACCGC 1020
ACCCTTCACG GTGGTGGGGG CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080
CGTCTCCGGG CTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCAGCC 1140
GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCACGG GGATCGTCAG 1200
CGCTCTCAAC CGTCCAGTGT CGACGACCGG CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260
CGCCATTCAG ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320
GAACGCTCAA CTCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA 1380
TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGATTCCA GTCGACCAGG CCAAGCGCAT 1440
CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC 1500
CAATGACAAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560
GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG 1620
CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC 1680
CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740
GTGATGAAGG TCGCCGCGCA GTGTTCAAAG C 1771

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1058 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 60
ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180
CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 240
ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300
TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360
CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420
TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480
CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540
TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600
GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660
AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720
TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780
GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840
CTGTCAAGAT GCTTGATCCT GGCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900
AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960
TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020
GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC 60
GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 120
CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA 180
AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA 240
AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 300
GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG 360
CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA 420
CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 480
AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC 540
GG 542

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC 60
CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC 120
TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG CACCGTTACC 180
GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT GAACCGCCGC 240
CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC CCCGCCGGCG CCGCCAATTG 300
CCGAACAGCC AMGCACCGTT GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC 360
GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC 420
GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG 480
CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 540
TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG 600
CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC 660
CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT 720
TGGTGCTGCT GAAGCCGTTA GCGCCGGTTC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC 780
CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 840
TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTTGG CCGCCGGCGC 900
CGCCGGCGGC CGC 913



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1872 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 60
TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 120
GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 180
GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 240
CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 300
CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 360
GACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGTTCAGCG TCGGCTCCGG 420
CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 480
GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 540
GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCGGA ACGCCCCGTG CGGTGCCTGG 600
CAGGGTGGTC GCGCTCGGCC AAACCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGAAGA 660
GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGTGATT CGGGCGGGCC 720
CGTCGTCAAC GGCCTAGGAC AGGTGGTCGG TATGAACACG GCCGCGTCCG ATAACTTCCA 780
GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840
CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 900
CTTGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960
TCCGGCGGCA AGTCTCGGCA TCTCCACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC 1020
GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTAACGGG CATCATCCCG GTGACGTCAT 1080
CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 1140
GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 1200
CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CATTGTGGAA 1260
GCAATGAACG AGGCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 1320
GGCGGTGTGG TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380
GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 1440
GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500
TGGCTTGGCA TCGACTGCAT CTGTTGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 1560
GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 1620
GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA CCTGGTGATG 1680
AATCACACCT CGGAGTCGCA CCCCTGGTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 1740
TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 1800
TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA GTTNCTACTG 1860
GCACCGATTC TT 1872

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1482 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60
CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120
ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 180
TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240
GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 300
ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360
CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 420
TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCAAGATG CTTGATCCTG 480
GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540
TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600
AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTTGTG 660
AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 720
GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780
CGGTCTTTGA GCCGGTAGCT GTCGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGAGG 840
CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900
AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 960
AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGG 1020
AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 1080
GCGAACCGTG CTACCCATTC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 1140
GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCGG TGCCAGGCGG GGCCCAAAAA 1200
CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 1260
TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGCG 1320
GGCGGCGCGG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380
GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 1440
GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 1482

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT 60
CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120
CGGTCACTCG TTGCTGCTGG ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 180
CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 240
GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300
GGAGAACTTC GATCCCGAGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA 360
GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420
GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 480
TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 540
TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGGTGAT 600
CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 660
ATACCTCACG TTGGGCACCG ACGGGTTCGG TTTTTCCGAC ACTCGGCCCG CCGGTCGTCG 720
TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTTGGGAGGG GTTGGCCGGG 780
TCGACGGGTG AATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 840
ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
ATCCCCCCGG GCTGCAGGAA TTCGGCACGA GAGACAAAAT TCCACGCGTT AATGCAGGAA 60
CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGCGGT TTATTTCGAC 120
AGCGAAGACC TGCCGCAGTT GGCGAAGCAT TTTTACAGCC AAGCGGTCGA GGAACGAAAC 180
CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 240
GTAGACACGG TGCGAAACCA GTTCGACAGA CCCCGCGAGG CACTGGCGCT GGCGCTCGAT 300
CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 360
GATTTCCTCG GCGAGCAGTT CATGCAGTGG TTCTTGCAGG AACAGATCGA AGAGGTGGCC 420
TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTT CGAGCTAGAG 480
AACTTCGTCG CACGTGAAGT GGATGTGGCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 540
GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 600
TCCAGCCAGG CCTTGGTGCG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 660
CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 720
GCTGCCGAGC GGTCAACGAG TTGCGGATAT TCCTTTAACG CAGGCAGTGA [illegible] 780
GCGGTTGGCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC 840
AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGGGGGG GCGAAAACGA 900
CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 960
GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 1020
T 1021

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 60
ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 120
CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 180
TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240
CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300
GGNGNGNATC GNCGANCACA A 321

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 60
CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120
CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 180
ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240
GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300
GGTGGATCCC AAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360
CTTACCATCG CCG 373



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



GTGACGCCGT GATGGGATTC CTGGGCGGGG CCGGTCCGCT GGCGGTGGTG GATCAGCAAC 60
TGGTTACCCG GGTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120
TCTTGACGGC CTGGTACGGG TTGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180
TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240
GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300
TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60
GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120
CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180
GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240
GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300
CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360
GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420
TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480
CGTTGACCGG GCCGCATCAG CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540
CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600
AAAATAGGCG ACGGTTTTGG CAACCGCTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660
GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC CGTTCGCCGG 720
ATCGTG 726



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 



CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60
GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120
CTGCCCGATG GCGGCCCGGT GAAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180
ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240
GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300
AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360
AGGGGCGCGG CATTGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420
TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACC 480
AGCCCAACGG CGAATGGTCG GCGTTACGCG CACACCTTCC GGTAGATGTC CAGTGTCTGC 540
TCGGCGATGT ATGCCCAGGA GAACTCTTGG ATACAGCGCT 580

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
AACGGAGGCG CCGGGGGTTT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 60 
GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120 
GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 60 

CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 120 

AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 180 

GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 240

GCCTACGAGC GCAACGTACA GACCAACGCC CG 272
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA 60 
AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120
CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180
GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240
GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300
CGGCCTGGTT GCGCGGG 317



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60 
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 120 
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 180 
GG 182

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60
CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120
GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180
GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240
CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300
ACGTTTGG 308



(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60 

CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120 

GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180 

ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240

TCGACGCGGC AATCCAGGGC GGTCTGG 267
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1539 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60 

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180 

ACGGAGCGGT CGCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 240

AACCTTCAAG CGCGGCCGAT AACTGAGGTG CATCATTAAG CGACTTTTCC AGAACATCCT 300 
GACGCGCTCG AAACGCGGTT CAGCCGACGG TGGCTCCGCC GAGGCGCTGC CTCCAAAATC 360

CCTGCGACAA TTCGTCGGCG GCGCCTACAA GGAAGTCGGT GCTGAATTCG TCGGGTATCT 420

GGTCGACCTG TGTGGGCTGC AGCCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 480

GCGGATGGCG TTGCCGCTCA CCGGCTATCT GAACAGCGAG GGACGCTACG CCGGCTTCGA 540

TATCTCGCAG AAAGCCATCG CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 600

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 660 

ACTAGACTTT CGCTTTCCAT ATCCGGATGC GTCGTTCGAT GTGGTGTTTC TTACCTCGGT 720 

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 780 

GAAGCCCGGC GGACGATGCC TGTGCACGTA CTTCTTGCTC AATGACGAGT CGTTAGCCCA 840

CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 900 






/~\ /~t TV /~> TV TV TV Tv 

CCACAAGAAG 




A APP AATPPG 


CTTGCCGGAG 


ACCTTCGTCA 


GGGATGTCTA 


960 


nr» /"» r*> tv t\ r~* T> T 1 

TGGCAAG 1 It 


bibib^b- X b*b>b,b,bi 


TPP APPAACC 

1 v^^n.v^ o^i-ri.^ 


ATTGCACTAC 


GGCTCATGGA 


GTGGCCGGGA 


1020 


tv rT~*7\ c r" f T 1 7\ 


/\bjb- 1 I bbribb 


AP ATCGTCAT 


CGCGACCAAA 


ACCGCGAGCT 


AGGTCGGCAT 


1080 


CCGGGAAb>OA 


•p p p p p a p a p p 

1 b*bib^b>.rib-.r-i^Ly 


PTPGCGCCGA 


GCGCCGCTGC 


CGGCAGGCCG 


ATTAGGCGGG 


1140 


/"> TV /"* TV m T» TV 

CAGATTAGCG 


PPPPPPPPPT 
Ubjb^bxbjLrbjbib' J. 




AGTACGGCGC 


CCCGAATGGC 


GTCACCGGCT 


1200 


GG I AALLAL-b 




TGGGCGGCGG 


CCTGCCGGAT 


CAGGTGGTAG 


ATGCCGACAA 


1260 


AGCCTGCGTG 


ATCGGTCATC 


ACCAACGGTG 


ACAGCAGCCG 


GTTG 1 bbALb 


7APPPPPAAPP 
t\\3 O bj b/ bii-irib. bJ 


J. O u 


CCACCCCGGT CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCCCACA TGACCAAACC 1380
CCGGCATCAC GTTGCCGATC GGCATACCGT GATAGCCAAG ATGAAAATTT AAGGGCACCA 1440
ATAGATTTCG ATCCGGCAGA ACTTGCCGTC GGTTGCGGGT CAGGCCCGTG ACCAGCTCCC 1500
GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 1539



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60
CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120
CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180
CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240
GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300
GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360
CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 420
GGCCCAGCCC ACGCAGGGCA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480
CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540
GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600
GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660
GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720
GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780
GAACGGTGGT CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840
GAAACAGTTA C 851



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC 60 

CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG 120 

CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACACCC 180 

CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC 240

GCTTGGTCAA GATC 254
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GATCCTGACC GAAGCGGCCG CCGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 60
CGATCTGGCG CTGCGGATCG CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 120
TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGGTT 180
GATCGTGGAC CGGATGAGCG CGCCGTATGT GGAAGGCGCG TCGATCGATT TCGTCGACAC 240
TATTGAGAAG CAAGGTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG CGCGTGCGGG 300
GATTCGTTCA ACTGATAAAA CGCTAGTACG ACCCCGCGGT GCGCAACACG TACGAGCACA 360
CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATGCC TTGCACCTGA CCGCGTGGCG 420
GGCCGCCGGC GGCAGGTGTC ACCTGCATGG TGAACAGCAC CTGGGCCTGA TATTGCGACC 480
AGTACACGAT TTTGTCGATC GAGGTCACTT CGACCTGGGA GAACTGCTTG CGGAACGCGT 540
CGCTGCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAC 600


CGCACAGCGC


ATTGCGAACG 


ATGGTGTCCA 


CATCGCGGTT 




TTGAGGTATC 


660 


CCTGAATCGC 


GGTTTTGGCC 


GGTCCCTCCG 


AGAATGTGCC 


1 LlL, X W J- X ^ 


GCTCCGTTGG 


720 


TGCGGACCCC 


GTATATGATC 


GCCGCCGTCA 


TAGCCGACAC 


LHb ^bi»» Vjj.rt.vj 0 


GCTACCACAA 


780 


TGCCGATCAG 


CAGCCGCTTG 


TGCCGTCGCT 


TCGGGTAC3L1A 


p z\ r p t f; r* G c 


GGCACGCCGG 


840 


GATATGCGGC 


GGGCGGCAGC 


GCCGCGTCGT 


CTGCCGGTCC 


r- r r rz czc n 7\ A 


GCCGGTTCGG 


900 


CGGCGCCGAG 


GTCGTGGGGG 


TAGTCCAGGG 


CTTGGGGTTC 




f^CTCGGGGT 

OOb X V^\JVJ\JJ\J J. 


960 


ACGGCGCCGG 


TCCGTTGGTG 


CCGACACCGG 


GGTTCGGCGA 


O X bboun^^u 


GGCATTGTGG 


1020 


TTCTCCTAGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTCAGCC 1080
GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG CGCAACAGCT GCCGTCAGCT 1140
CTCAACGCGA CGGGGCGGGC CGCGGCGCCG ATAATGTTGA AAGACTAGGC AACCTTAGGA 1200
ACGAAGGACG GAGATTTTGT GACGATC 1227









(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCGGGC GGGGCCGGCG 60
GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 120
GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 180
G 181

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 290 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60 
GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120
GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180
CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240
GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 34 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC 60 
TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120 
TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG 155 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120
AGGGCGGCAA CG 132

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA 60
CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG 120
GCANCGGCGG CA 132

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540
CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600
TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660
GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60 

GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120 

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180 

CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60 

CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120 

GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180 

TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240

TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300 

CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360

AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420

CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480

TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA 540

CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600 

ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660 

CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720

ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780

CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA 840

CCGGCTTTCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG 900 

GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960 

GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020 

GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 



CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60
CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120
AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180
CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240
AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300
CCGCTAATAC GAAAAGAAAC GGAGCAA 327



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 



CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 60 
CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120 
GGGCCGT 127

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 60 
CGGCGGCTCC GGCCTCAACG G 81 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60
GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 120
GAAACGGTGG TGCCGGTGGG CTGATCTGG 149
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 60 

ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120 

TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180 

CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGATAACTGA 240

GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 300 

ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 355 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 60
CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120
CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180
CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG 240
GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300
GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360
GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420
GCCCACTTCG ACTACGGTTC AGCACTCCTC AGCAAAACCA CCGGGGACCC GCCATTTCCC 480
GGACAGCCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 540
CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600

GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660 

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720 

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 7 80 

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 84 0 

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900 

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960 

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro
35 40 45

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
50 55 60

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro
65 70 75 80

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
130 135 140

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
195 200 205

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys
225 230 235 240

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
275 280 285

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
305 310 315 320

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 
1 5 10 15

Ala 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr 
1 5 10 15 

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230
225 230 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125 

Gly Pro Pro Ala 
130 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala 
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys 
100 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro 
100 105 110 

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly 
115 120 125 

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg 
145 150 155 160 

Asp Arg Arg 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr
35 40 45

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg 
65 70 75 80 

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly 
85 90 95 

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110 

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr 
115 120 125 

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr 
130 135 140

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175 

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu 
180 185 190 

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro 
195 200 205

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe
210 215 220 

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro 
225 230 235 240 

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255 

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro 
260 265 270 

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala 
275 280 285 

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300 

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr 
305 310 315 320 

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala 
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu
100 105 110

Asp Gln Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala
115 120 125

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met
130 135 140

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro
145 150 155 160

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala
165 170 175

Glu Leu Gln Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu
180 185 190

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly
195 200 205

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser 
210 215 220 

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser 
225 230 235 240 

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser
245 250 255

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu
260 265 270

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr
275 280 285

Gly Lys Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala Ile
290 295 300

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp
305 310 315 320

Thr Ile Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg Ile Glu Ala
325 330 335

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn
340 345 350

Leu Gly Ser Ile Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp
355 360 365

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp
370 375 380

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala
385 390 395 400

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu
405 410 415

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp
450 455 460

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480

Val Ala Pro Thr Gly 
485 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu
1 5 10 15

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val
20 25 30

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala
35 40 45

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His
50 55 60

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu
65 70 75 80

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro
85 90 95

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp
100 105 110

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro
115 120 125

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn
130 135 140

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala
145 150 155 160 

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp 
165 170 175 

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu 
180 185 190 

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr
35 40 45

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala
50 55 60

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp
65 70 75 80

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu
85 90 95

Gln



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala 
1 5 10 15 

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
20 25 30 

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
35 40 45 

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg
50 55 60

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala
65 70 75 80

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp
85 90 95

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg
100 105 110

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala
115 120 125

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro
130 135 140

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro
145 150 155 160

Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile
165 170 175

Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln
180 185 190 

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 
195 200 205 

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 
210 215 220 

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu
225 230 235 240

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr
245 250 255

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys
260 265 270

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu
275 280 285

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile
290 295 300

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr
305 310 315 320

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly
325 330 335

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe
340 345 350

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser
355 360 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp
1 5 10 15

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val
20 25 30

Val Val Arg Gln Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro
35 40 45

Pro Arg Arg His Pro Ala Gln Gly His Arg Arg Arg Val Ala Pro Ser
50 55 60



Gly Gly Arg Arg Arg Pro His Pro His His Val Gln Pro Asp Asp Arg
65 70 75 80

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gln Pro Ala Glu His Pro
85 90 95

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg
100 105 110

Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp
115 120 125

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val
130 135 140

Gln His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg
145 150 155 160

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gln Gly
165 170 175

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala
180 185 190

Ala Val Glu Asn Leu His Arg Gly Ser Gln Arg Ala Asp Gly Arg Val
195 200 205

Phe Arg Pro Ile Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg
210 215 220

Ala Gly Pro Gln Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro
225 230 235 240

Leu Pro Ala Arg Ala Gly Gln Gln Gln Pro Ser Ser Ala Gly Gly Arg
245 250 255

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gln Arg Gly Arg His
260 265 270

His Gln Gly Gly His Asp Pro Gly Arg Gln Gly Ala Gln Arg Gly Thr
275 280 285 

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg 
290 295 300 

Asn Arg Pro Arg Arg 
305 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly
1 5 10 15 

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys 
20 25 30 

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gln Ala Ala
35 40 45 

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys 
50 55 60 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr 
65 70 75 80 

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser
85 90 95

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His
100 105 110

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gln
115 120 125

Glu Glu Gln Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro
130 135 140

Pro Gln Gln Pro Gly Thr Pro Gly Tyr Ala Gln Gly Gln Gln Gln Thr
145 150 155 160

Tyr Ser Gln Gln Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gln
165 170 175

Pro Thr Gln Tyr Arg Gln Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro
180 185 190

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met
195 200 205

Val Arg Gln Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr
210 215 220

Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val
225 230 235 240 

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala 
245 250 255 

Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val
260 265 270

Glu Gln Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr
275 280 285

Asp Leu Gly Arg Gln Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala
290 295 300

Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys
305 310 315 320

Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp
325 330 335

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 
340 345 350 

Ile Ala Val Val Arg Val Gln Gly Val Ser Gly Leu Thr Pro Ile Ser
355 360 365

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gln Pro Val Leu Ala Ile
370 375 380

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser
385 390 395 400

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gln Asn
405 410 415

Thr Val Leu Asp Ala Ile Gln Thr Asp Ala Ala Ile Asn Pro Gly Asn
420 425 430

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gln Leu Val Gly Val Asn
435 440 445

Ser Ala Ile Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gln Ser Gly
450 455 460

Ser Ile Gly Leu Gly Phe Ala Ile Pro Val Asp Gln Ala Lys Arg Ile
465 470 475 480

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly
485 490 495

Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu
500 505 510

Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val
515 520 525

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu
530 535 540

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr
545 550 555 560

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly
565 570 575

Lys Ala Glu Gln
580 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu
1 5 10 15

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro 
20 25 30 

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro
35 40 45

Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu
50 55 60

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu
65 70 75 80

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala
85 90 95

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg
100 105 110

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn
115 120 125

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala
130 135 140

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln
145 150 155 160

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr
165 170 175

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala
180 185 190

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val
195 200 205

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser
210 215 220 

Lys Trp Asn Glu Pro Val Asn Val Asp 
225 230 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala
1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val
20 25 30

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile
35 40 45

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln
50 55 60 

Pro Arg 
65 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser
1 5 10 15

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala 
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60 

Ser Pro Pro Leu Pro 
65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser 
1 5 10 15

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala
20 25 30

Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu
35 40 45

Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val
50 55 60

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr
65 70 75 80

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
85 90 95

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln
100 105 110

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala
115 120 125

Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly
130 135 140

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly
145 150 155 160

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu
165 170 175 

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr
180 185 190

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser
195 200 205

Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr
210 215 220

Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala
225 230 235 240

Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly
245 250 255

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu
260 265 270

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val
275 280 285

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile
290 295 300

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp
305 310 315 320

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln
325 330 335

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly
340 345 350

Pro Pro Ala

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr
1 5 10 15

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala
20 25 30

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys
35 40 45

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala
50 55 60

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly
65 70 75 80

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp
85 90 95

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val
100 105 110

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn
115 120 125

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys
130 135 140

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly
145 150 155 160

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser
165 170 175

His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln
180 185 190

Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp
195 200 205

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val
1 5 10 15

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln
20 25 30

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val
35 40 45

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu
50 55 60

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe
65 70 75 80

Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu
85 90 95

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala
100 105 110

Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val
115 120 125

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp
130 135 140 

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn 
145 150 155 160 

Arg Asp Gly Val Val Ile Glu Thr Glu Lys Leu Arg His Pro Asp Arg
165 170 175 

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly 
180 185 190 

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile
195 200 205 

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe 
210 215 220 

Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp 
225 230 235 240 

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg
245 250 255

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln
260 265 270

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys
275 280 285

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr
1 5 10 15

Ala Ala Gln Gln Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp
20 25 30

Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg
35 40 45

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asp Leu Arg
50 55 60

Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro
65 70 75 80

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp
85 90 95

Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu
100 105 110

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val
115 120 125 

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn 
130 135 140 

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro 
145 150 155 160 

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu 
165 170 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile
1 5 10 15

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly
20 25 30

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro
35 40 45

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa
50 55 60

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp
65 70 75 80

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile
85 90 95

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln
100 105 

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn 
1 5 10 15

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr
20 25 30

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly
35 40 45

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr
50 55 60

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr
65 70 75 80

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu
85 90 95

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr
100 105 110

Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val 
1 5 10 15

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala
20 25 30

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu
35 40 45

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala
50 55 60

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg Gln Trp
65 70 75 80

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu
85 90 95

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa 
100 105 110 



Arg Ser Ser Xaa Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu
1 5 10 15

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln
20 25 30

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp
35 40 45

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe
50 55 60

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro
65 70 75 80

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro
85 90 95

Pro Ala Ala Gly Gly Gly Ala
100

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Val Gln Cys Arg Val Trp Leu Glu Ile Gln Trp Arg Gly Met Leu Gly
1 5 10 15

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His
20 25 30

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala
35 40 45

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly
50 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 
65 70 75 80 

Asp Glu Leu Lys Gly Val Thr Ser 
85 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
1 5 10 15

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
20 25 30

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
35 40 45

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
50 55 60

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
65 70 75 80

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15 

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val 
20 25 30 

Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln
35 40 45 

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala 
50 55 60 

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa
65 70 75 80 

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp 
130 135 140 

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr
145 150 155 160

Leu Thr Leu Gln Gly Asp
165 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Arg Ala Glu Arg Met 
1 5 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala
1 5 10 15

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr
20 25 30

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu
35 40 45

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn
50 55 60

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe
65 70 75 80 

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 
85 90 95 

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gin Ala 
100 105 " HO 

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gin Leu Met 
115 120 125 

Asn Asn Val Pro Gin Ala Leu Lys Gin Leu Ala Gin Pro Thr Gin Gly 
130 135 140 

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 
145 150 155 160 

His Arg Ser Pro lie Ser Asn Met Val Ser Met Ala Asn Asn His Met 
165 170 175 

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met 
180 185 190 

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gin Ala Val Gin Thr Ala 
195 200 205 
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Ala Gin Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly 
210 " 215 220 

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala 
225 J 230 235 240 

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly 
245 250 255 

Arg Arg Asn Gly Gly Pro Ala 
2 60 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala
1 5 10 15

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly
20 25 30

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly
35 40 45

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr
50 55 60

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro
65 70 75 80

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val
85 90 95

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu
100 105 110

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr
115 120 125

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln
130 135 140

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr
145 150 155 160

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg
165 170 175

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly
180 185 190

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln
195 200 205

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser
210 215 220

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala
225 230 235 240

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser
245 250 255

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser
260 265 270

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn
275 280 285

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Gly Cys Gly Glu Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn
1 5 10 15

Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Gly Cys Gly Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala
1 5 10 15

Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Gly Cys Gly Gly Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu
1 5 10 15

Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu
20 25 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Gly Cys Gly Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr
1 5 10 15

Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 98: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Gly Cys Gly Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu
1 5 10 15

Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
20 25 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 



ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120

GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300

AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360

GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480

GAGTTGCTGC AGGCCGCAGG GAACTGA 507



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala
1 5 10 15

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro
20 25 30

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu
35 40 45

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
50 55 60

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly
65 70 75 80

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp
85 90 95

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe
100 105 110

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp
115 120 125

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val
130 135 140

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met
145 150 155 160

Glu Leu Leu Gln Ala Ala Gly Asn
165

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480

GCCGCCACCG CGGTGGAGCT 500

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro 
1 5 10 15

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala
20 25 30

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser
35 40 45

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
50 55 60

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
65 70 75 80

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
85 90 95

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60

AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120

GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser
1 5 10 15

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly
20 25 30

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser
35 40 45

Glu Ala Tyr 
50 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120 

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180 

GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 60 

GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 120

TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 180

TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 240

GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 300 

CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 360 

GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 420

AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC 480

AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG 540

TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 600 

CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA GGCCGAGCTG ACCGCCGCCC 660

AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG GCTGACGGTG CCCCCGCCGG 720 

TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC GACCAACCTC TTGGGGCAAA 780

ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGCGA GATGTGGGCC CAAGACGCCG 840

CCGCGATGTT TGGCTACGCC GCGGCGACGG CGACGGCGAC GGCGACGTTG CTGCCGTTCG 900 

AGGAGGCGCC GGAGATGACC AGCGCGGGTG GGCTCCTCGA GCAGGCCGCC GCGGTCGAGG 960 

AGGCCTCCGA CACCGCCGCG GCGAACCAGT TGATGAACAA TGTGCCCCAG GCGCTGCAAC 1020 

AGCTGGCCCA GCCCACGCAG GGCACCACGC CTTCTTCCAA GCTGGGTGGC CTGTGGAAGA 1080 

CGGTCTCGCC GCATCGGTCG CCGATCAGCA ACATGGTGTC GATGGCCAAC AACCACATGT 1140

CGATGACCAA CTCGGGTGTG TCGATGACCA ACACCTTGAG CTCGATGTTG AAGGGCTTTG 1200 

CTCCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGCGCA AAACGGGGTC CGGGCGATGA 1260

GCTCGCTGGG CAGCTCGCTG GGTTCTTCGG GTCTGGGCGG TGGGGTGGCC GCCAACTTGG 1320 

GTCGGGCGGC CTCGGTCGGT TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 1380 

CAGTCACCCC GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC GCGGAAAGAG 1440

GGCCCGGGCA GATGCTGGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 1500 

GGCTCAGTGG TGTGCTGCGT GTTCCGCCGC GACCCTATGT GATGCCGCAT TCTCCGGCGG 1560 

CCGGCTAGGA GAGGGGGCGC AGACTGTCGT TATTTGACCA GTGATCGGCG GTCTCGGTGT 1620

TTCCGCGGCC GGCTATGACA ACAGTCAATG TGCATGACAA GTTACAGGTA TTAGGTCCAG 1680 

GTTCAACAAG GAGACAGGCA ACATGGCCTC ACGTTTTATG ACGGATCCGC ACGCGATGCG 1740

GGACATGGCG GGCCGTTTTG AGGTGCACGC CCAGACGGTG GAGGACGAGG CTCGCCGGAT 1800

GTGGGCGTCC GCGCAAAACA TTTCCGGTGC GGGCTGGAGT GGCATGGCCG AGGCGACCTC 1860

GCTAGACACC ATGGCCCAGA TGAATCAGGC GTTTCGCAAC ATCGTGAACA TGCTGCACGG 1920 

GGTGCGTGAC GGGCTGGTTC GCGACGCCAA CAACTACGAG CAGCAAGAGC AGGCCTCCCA 1980 

GCAGATCCTC AGCAGCTAAC GTCAGCCGCT GCAGCACAAT ACTTTTACAA GCGAAGGAGA 2040

ACAGGTTCGA TGACCATCAA CTATCAATTC GGGGATGTCG ACGCTCACGG CGCCATGATC 2100

CGCGCTCAGG CCGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 2160

GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 2220

GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 2280

GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 2340

GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT 2400

CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 2460

CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 2520

TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 2580

GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACCTCTCGG TTGAGGCGTT CCTGGGGGTT 2640

GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 2700

GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 2760

CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 2820

GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 2880

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 2940

GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3000

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala
145 150 155 160

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr
165 170 175

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn
225 230 235 240

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala
275 280 285

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly
290 295 300

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val
305 310 315 320

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg
325 330 335

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly
340 345 350

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly
355 360 365 

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met 
370 375 380 

Pro His Ser Pro Ala Ala Gly 
385 390

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 



GACGTCAGCA CCCGCCGTGC AGGGCTGGAG CGTGGTCGGT TTTGATCTGC GGTCAAGGTG 60

ACGTCCCTCG GCGTGTCGCC GGCGTGGATG CAGACTCGAT GCCGCTCTTT AGTGCAACTA 120

ATTTCGTTGA AGTGCCTGCG AGGTATAGGA CTTCACGATT GGTTAATGTA GCGTTCACCC 180

CGTGTTGGGG TCGATTTGGC CGGACCAGTC GTCACCAACG CTTGGCGTGC GCGCCAGGCG 240

GGCGATCAGA TCGCTTGACT ACCAATCAAT CTTGAGCTCC CGGGCCGATG CTCGGGCTAA 300

ATGAGGAGGA GCACGCGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTGG 360

CGAACTTCGT TCCCTGGGGG CAACGCTGAA GGCTAGCAAT GCCGCCGCAG CCGTGCCGAC 420

GACTGGGGTG GTGCCCCCGG CTGCCGACGA GGTGTCGCTG CTGCTTGCCA CACAATTCCG 480

TACGCATGCG GCGACGTATC AGACGGCCAG CGCCAAGGCC GCGGTGATCC ATGAGCAGTT 540

TGTGACCACG CTGGCCACCA GCGCTAGTTC ATATGCGGAC ACCGAGGCCG CCAACGCTGT 600

GGTCACCGGC TAGCTGACCT GACGGTATTC GAGCGGAAGG ATTATCGAAG TGGTGGATTT 660

CGGGGCGTTA CCACCGGAGA TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC 720

GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 780

GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 840

TCTGATGGCG GCGGCGGCCT CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC 900

CCAGCTGACC GCCGCCCAGG TCCGGGTTGC TGCGGCGGCC TACGAGACAG CGTATAGGCT 960

GACGGTGCCC CCGCCGGTGA TCGCCGAGAA CCGTACCGAA CTGATGACGC TGACCGCGAC 1020

CAACCTCTTG GGGCAAAACA CGCCGGCGAT CGAGGCCAAT CAGGCCGCAT ACAGCCAGAT 1080

GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGCGA CGGCGACCGA 1140

GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC CCCGGCGGGC TCCTTGAGCA 1200

GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT 1260

GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC GTCGTACCTT CTTCCAAGCT 1320

GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGCCG CTCAGCAACG TCAGTTCGAT 1380

AGCCAACAAC CACATGTCGA TGATGGGCAC GGGTGTGTCG ATGACCAACA CCTTGCACTC 1440

GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 1500

GGTCTGGGCG ATGAGCTCGC TGGGCAGCCA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 1560

GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 1620

GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 1680

CAGCCTGACC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 1725



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met
130 135 140

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val
245 250 255

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly
355 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3027 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 60

CGTCGGACGC GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 120

CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 180

GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 240

GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 300

ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 360

CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 420

GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 480

GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 540

TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 600

CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 660

CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 720

TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 780

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 840

GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900 
GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 960 

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 1020 

ATTCTCATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 1080

TACGACCCGA TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 1140

ACGGGACAAT CGCGCTGTTG TGTATTCACC CGACACGTCG ACCAAGACTT CGCTACCGCC 1200 

AGGTCGCACC TCGCCGGCGA TTTCCTGTCC TATACGACCA GTTCACGCAG CAGATCGTGG 1260

CTCCGGCGGC CAAACAGAAG TCACTGAAAA CCACCGCCAA GGTGGTGCGC GCGGCCGTGT 1320 

CGGAGCTACA TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 1380

AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 1440

ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGCCGTAGGC GGTCGCCAAG 1500 

TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 1560

GCCCGACCTC AAACAGATCT CGGCCGCTGT CTAATCGGCC GGGTTATTTA AGATTAGTTG 1620

CCACTGTATT TACCTGATGT TCAGATTGTT CAGCTGGATT TAGCTTCGCG GCAGGGCGGC 1680

TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 1740

GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 1800

CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 1860

CTCGCTGGTG GCCGCCGCGA AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC 1920 

GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 1980 

GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 2040

GGCCGAGCTG ACCGCCGCCC AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 2100 

GCTGACGGTG CCCCCGCCGG TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 2160 

GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 2220 

GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GCCACGGCGG CGACGGCGAC 2280 

CGAGGCGTTG CTGCCGTTCG AGGACGCCCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 2340

GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GCGAACCAGT TGATGAACAA 2400

TGTGCCCCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 2460

ACTGAGTGAA CTCTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGCA ACATCGTGTC 2520 

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGCCA GCACCTTGCA 2580

CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGCAAAA 2640

CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 2700

TCTGGGCGCT GGGGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2760 

GCCGCAGGCC TGGGCCGCGG CCAACCAGGC GGTCACCCCG GCGGCGCGGG CGCTGCCGCT 2820 

GACCAGCCTG ACCAGCGCCG CCCAAACCGC CCCCGGACAC ATGCTGGGCG GGCTACCGCT 2880 

GGGGCAACTG ACCAATAGCG GCGGCGGGTT CGGCGGGGTT AGCAATGCGT TGCGGATGCC 2940

GCCGCGGGCG TACGTAATGC CCCGTGTGCC CGCCGCCGGG TAACGCCGAT CCGCACGCAA 3000 
TGCGGGCCCT CTATGCGGGC AGCGATC 3027 
(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu
210 215 220

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn
355 360 365 

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro 
370 375 380 

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly 
385 390 395 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1616 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 60

GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 120

TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 180 

AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 240

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300 

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360 

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600 

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660 

GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720 

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780

GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840

CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900 

GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 960 

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020 

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080

TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 1200 

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 1260 

GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 1320 

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380 

GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440

AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500 

TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 1560

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 1616 
(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60 

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120 

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180 

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300 

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420

TACGCCTCCG AA 432

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met 
1 5 10 15

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln
20 25 30

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg
35 40 45

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala
50 55 60

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr
65 70 75 80

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr
85 90 95

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn
100 105 110

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn
115 120 125

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp
130 135 140

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val
145 150 155 160

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro
165 170 175

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro
180 185 190

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr
195 200 205

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln
210 215 220

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly
225 230 235 240

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly
245 250 255

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser
260 265 270

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly
275 280 285

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val
290 295 300

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly
305 310 315 320

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser
325 330 335

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln
340 345 350

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp
355 360 365



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly
1 5 10 15

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val
20 25 30

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly
35 40 45

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys
50 55 60

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly
65 70 75 80

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser
85 90 95

Gln Met Gly Phe
100

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 396 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120 

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180 

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60 

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120 

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180 

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387 
(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
1 5 10 15

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn
20 25 30

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro
1 5 10 15

Gly Gly Arg Arg Xaa Phe
20

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either a Pro or Thr"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(ix) FEATURE:

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either a Gln or Leu"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly
1 5

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile
1 5 10 15

Asn Val His Leu Val 
20 

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 



GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 60

TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 120

GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 180

ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 240

GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300

CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360

TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC AGCGCCGACG ACGCCGGTGA 420

CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 480

CGACGCCGCC GACCACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 540

CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC GACGACGGTC GCTCCGACCA 600

CCGTCGCCCC GACCACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660

CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720

CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780

GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGCT TCACTACGGT CGGAGGACAT 840

GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882



(2) INFORMATION FOR SEQ ID NO: 139: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 60 

CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 120 

CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 240

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 300

TCGTGCCCGT ACCGCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 360

ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CGCCATTGGC ACCGCCGTCA CCGCCGGCTG 420

GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGCGC AAGTACTCTC GGTCACCGAG 480

CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 540

AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 600 

CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 660 

CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 720 

ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 780 

GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 815 
(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 60

CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA CAGGCGCTGA TTTGACGCTC 120

TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 180

GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 240

GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 300

GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 360

AAGCGTGAGC GCGACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 420

ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 480

GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 540

TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 600

TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 660

GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT CGCATTCCAG 720

GGCCGCTCGA CTCGGCCGCC GTTGGACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 780

CTGCTGTACA AGAACATCAT AGGGGCGATC GAGCGTCACA GCCTGAACGC GTATATCGGT 840

TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG CCGAATTCGG CACGAGCTCC 900

GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 960

CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 1020

ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 1080

GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 1140

GTTCGGCACG AG 1152



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

CTCGTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60

CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120

CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180

CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240

ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGCGCG GGGCAGCTTC 300

GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360

TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420

TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480

TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540

TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600

TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCCCGGACC GCCCA 655

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val
1 5 10 15

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu
20 25 30

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr
35 40 45

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala
50 55 60

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg
65 70 75 80

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro
85 90 95

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro
100 105 110

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro
115 120 125

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr
130 135 140

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr
145 150 155 160

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr
165 170 175

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala
180 185 190

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro
195 200 205

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro
210 215 220

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala
225 230 235 240

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn
245 250 255

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe
260 265

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro
1 5 10 15

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro
20 25 30

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu
35 40 45

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro
50 55 60

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr
65 70 75 80

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro
85 90 95

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser
100 105 110

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro
115 120 125

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile
130 135 140

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala
145 150 155 160

Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly
165 170

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly
1 5 10 15

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg
20 25 30

Asn Arg Arg
35

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu
1 5 10 15

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr
20 25 30

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu
35 40 45

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala
50 55 60

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp
65 70 75 80

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala
85 90 95

Gly Gln Leu Arg Arg Gln Phe Tyr
100

(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53
(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42
(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31 
(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31
(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33 
(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33
(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 152..1273

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172
Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GGCCATCCGG TCGGCGCCTA 1573

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1993 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 



BNSOOCID: <WO 9816646A2JA> 



WO 98/16646 



PCI7US97/18293 



AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600

GGCGGCCATG TACCAGGGCA CCATCAAAAC CTGGGACGAC CCGCAGATCG CTGCGCTCAA 660

CCCCGGCGTG AACCTGCCCG GCACCGCGGT AGTTCCGCTG CACCGCTCCG ACGGGTCCGG 720

TGACACCTTC TTGTTCACCC AGTACCTGTC CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC 780

GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGCCG GGTGCGCTGG GTGAGAACGG 840

CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC TGCGTGGCCT ATATCGGCAT 900

CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG GCCCAACTAG GCAATAGCTC 960

TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC GCGGCGGCTG GCTTCGCATC 1020

GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCCGG ACGGCTACCC 1080

GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG CCACCGCGCA 1140

GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT CGTTCCTCGA 1200

CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 1260

GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC GGGCCATCGG 1320

GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380

CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 1440

GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500

ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTCGCCCATC 1560

CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG GACGCTGGCG ACCTCGGCAA 1620

TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC GCTGGTGATC GTGGAACGGC 1680

TGCCGAAACG GTTGGCCGAG GCTGTGGGAA TAGTCCTGGA ATTGCTCGCC GGAATCCCCA 1740

GCGTGGTCGT CGGTTTGTGG GGGGCAATGA CGTTCGGGCC GTTCATCGCT CATCACATCG 1800

CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860

CGGGCAACGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920

TTATCGCCAC CACCACTCAT GACCTGTTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980

CGATCGGGAA TTC 1993

(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370 

(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1777 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60 

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120 

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180 

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600

TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780

CATCGCGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680

GCCGATGGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60 

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120 

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180 

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300

CGTGACCGAC GCCGCCGATT CAGA 324



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60 

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180

GCACGACTTC CAGCCCGACT CGATCGGCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCGACGCCG ACGAGCAGCG 300

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACC CTGCGGTGCT 600

GGCGTCATTC GGACATCGGT CCGGCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660

GGAGCGCGGC GGTGCACGCG CTGCTGGACG GCCGGCAGAC GGTGCTGCTG CGTAAGGGCG 720

GGATCGGCGA GAAGCGCTTC GAGGTGGCGG CCCACGAGTT CTTGTTGTTC CCGACGGTCG 780

CGCACAGCCA CGCCGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080

CCAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140

CGCTGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCGTGGG TTGACTGGGC GGCATCGCTT 1200

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320

GCGCCCACCG CTACAACC 1338

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 



CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120

TAGGTGGGGC CGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240

AGGGCGGCGA CGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCC GGCACCGGCA 300

ACGGCGGCCA TGCCGGCAAC C 321



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60 

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120 

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480

ACCGTCGCCG GT 492

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala
1 5 10 15

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg
20 25 30

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr
35 40 45

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro
50 55 60

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu
65 70 75 80

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys
85 90 95

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys
100 105 110

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu
115 120 125

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala
130 135 140

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly
145 150 155 160

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu
165 170 175

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp
180 185 190

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg
195 200 205

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp
210 215 220

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser
225 230 235 240

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg
245 250 255

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn
260 265 270

His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr
275 280 285

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val
290 295 300

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met
305 310 315 320

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg
325 330 335

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val
340 345 350

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp
355 360 365

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400

Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln
405 410 415

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly
420 425 430

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp
435 440 445

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu
450 455 460

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu
465 470 475 480

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met
485 490 495

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala
500 505 510

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu
515 520 525

His Asp Ser Pro Ala Gly Arg Arg
530 535

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg
1 5 10 15

Tyr Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys
20 25 30

Asp Phe Gly Leu Val Gly Pro Ala Pro Asp Ile Val Thr Asp Ala Ala
35 40 45

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val
50 55 60

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu
65 70 75 80

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly
85 90 95

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His
100 105 110

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe
115 120 125

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser
130 135 140

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg
145 150 155 160

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val
165 170 175

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val
180 185 190

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg
195 200 205

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg
210 215 220

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His
225 230 235 240

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val
245 250 255

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala
260 265 270

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe
275 280 

(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 264 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

ATGAACATGT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60 

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120 

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180 

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240

GCAGCGGTGC TTGACGGTGT GGCG 264
(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 



TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420

ATCCCCAGCC GGCGCAGCGT TGCCCCTCGT GCCGAATTCG GCACGAGGCT GGCGAGCCAC 480

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600

GCCAGCGGGT ATCGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660

ACTATTGGCA CTTTCCGGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720

CCCCACGTAA CCCACGGCGT AGCTCCCGGC GTGACGCGGA GGATCGGCGG GTGATCTTTG 780

CCGCCACGCT CGTAGCCGTT GATCCACCGC TTCGCGGTGC CGGCGGGGAG GCCGATCAGC 840

TTATCGACCT CGGCGTATGC CGACGGCAAG CTGGGCGCGT TCGTCGAGGT CAAGAACTCC 900

ACCATCGGCA CCGGCACCAA GGTGCCGCAC CTGACCTACG TCGGCGACGC CGACATCGGC 960

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020

CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140

GGGGCGCTGG CAGTGTCGGC GGGTCCGCAA C 1171
(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 
GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227
(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60 

GGGCGCCCAC GGCTTCACTC CAACCAGCGG CGGCGACGGC GGCGACGGCG GCAACGGCGG 120 

CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180 

CGCCGGCACG GGCGGCAACG GCGGCCGCGG CGGCGACGGC GCGTTTGGTG GCATGAGTGC 240

CAACGCCACC AACCCTGGTG AAAACGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300

CGGC 304

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGGG CGAGCACCGC GCCCGGGATC 120

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGCGGG 180

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAGGAAAGCA 360

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GACAACAAGG 420

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC GTGCAGATGG 480

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAGACC 540

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCACA 600

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780

GGTTTTTCGC CAAAAACAAG GCGGCAGGTA AGCAAGGGTC TATCAACCGG TGCGGGCAGG 840

TGCTGTTCAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900

CCAACGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960

CCGGCTCCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT CAACGGCGCG GGCGGTGCTG 1020

GCGGGGCCGG CGGCAACGCG GGTGTCGCCG GCGTGTCCTT CGGCAACGCT GTGGGCGGCG 1080

ACGGCGGCAA CGGCGGCAAC GGCGGCCACG GCGGCGACGG CACGACGGGC GGCGCCGGCG 1140

GCAAGGGCGG CAACGGCAGC AGCGGTGCCG CCAGCGGCTC AGGCGTCGTC AACGTCACCG 1200

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320

GCGGCGCCAG CGGCAAGGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380

TCAACGTCAC CGCCGGCCAC GGCGGCAACG GCGGCAATGG CCGCAACGGC GGCAACGGC 1439



(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 329 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 



GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240

GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300

CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGGGGAT GGCCGGAGGT AACTGTGGTG 60 

CCGCCGGGCT GATCGGCAAC 80
(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA 60 

AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 120 

TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 180

CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 240

CACAACTGAT GGCGCGCAAC GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 300

CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 360

CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 392
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 



ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60

GGTATCAGCG GTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120

GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180

GGCGGCACCG GCGGCACCGG CGGAGCGGCC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240

ACCGGCGGCA CCGGCGGCGC GGTGGGCAGC GTCGGTAACG CCGGGATCGG CGGTACCGGC 300

GGTACGGGTG GTGTCGGTGG TGCTGGTGGT GCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360

GCTACCGGTG GCGCCGGGTT CGCCGGCGGC GCCGGCGGAG AAGGCGGACC GGGCGGCAAC 420

AGCGGTGTGG GCGGCACCAA CGGCTCCGGC GGCGCCGGCG GTGCAGGCGG CAAGGGCGGC 480

ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG 535



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60

GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120

CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180

GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240

CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA [illegible] 300

GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGC I GCCCnC [illegible] 360

TCTGGTGGCC GCGGTGGCCG CCGCACCGAC GCCGGCCbAtj [illegible] 420

GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480

ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540

ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGCGGG 600

CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660

CCTCGTCACC TAACGGATTC CCGACGGCAT 690



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60

TGGGCACAGC CGGGGGTGGC GGCAACGGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120

GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180

CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240

GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300

TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGCTACGG CGGCGATGGC GGTGGCGGCG 360

GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 
GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60 
TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120 






ACAGCCAGTC CTTGGCGGCC TGCGAGGCGA ACACGTCGGT GTCACCGGTG TAGATCGCCG 180

GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240

TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300

GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360

GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420

CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60 

GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120 

GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180

GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 219 
(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCCAC GGCGGTGACG GCGTCGGCGG 60

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT CGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCGGGCG GTGGCGGAGG 360

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480

ATTTCCTGAT CACC 494

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120 

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 

GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220 
(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 388 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 



ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60

GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120

TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180

CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240

AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300

GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360

GCGCGTACTG CCCCGAACAC CTGGAACA 388



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60

ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120

TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180

GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCCGGC ACCACAGGCG 240

GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 300

GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 360

GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 400



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 538 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 



GGCAACGGCG GCAACGGCGG CATCGCCGGC ATTGGGCGGC AACGGCGTTC CGGGACGGGC 60

AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG 120

CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 180

GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 240

CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 300

GCAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360

CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 420

CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 480

CGGGGGCACC GGCGGCAACG GCGGCACCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC 538



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 
GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 
TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 
GCCACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC 
(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 985 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 



AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60

GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120

GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180

GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGGGCCGGCG GGGCCGGCGG GGCGACCGGT 240

ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 300

GGCCGCGGCG GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT 360

GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420

GGGGCCGGCG GGGCCGGCGG CAACGGCGGC GACGGCGGGG ACGGCGCAAC CGGTGCCGCA 480

GGTCTCGGCG ACAACGGCGG GGTCGGCGGT GACGGTGGGG CCGGTGGCGC CGCCGGCAAC 540

GGCGGCAACG CGGGCGTCGG CCTGACAGCC AAGGCCGGCG ACGGCGGCGC CGCGGGCAAT 600

GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660

CAGGGTGGTG CCGGCGGCCA AGGCGGCCAA GGCGGCTTGG GCGGGGCAAG CACCACCTGA 720

TCGGCCTAGC CGCACCCGGG AAAGCCGATC CAACAGGCGA CGATGCCGCC TTCCTTGCCG 780

CGTTGGACCA GGCCGGCATC ACCTACGCTG ACCCAGGCCA CGCCATAACG GCCGCCAAGG 840

CGATGTGTGG GCTGTGTGCT AACGGCGTAA CAGGTCTACA GCTGGTCGCG GACCTGCGGG 900

AATACAATCC CGGGCTGACC ATGGACAGCG CGGCCAAGTT CGCTGCCATC GCATCAGGCG 960

CGTACTGCCC CGAACACCTG GAACA 985


(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2138 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 



CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300

CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420

CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660

CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT 720

TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 780

AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 840

CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 900

AACGGCTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 960

ATCAGCAGAG GTCGGAGAAG GTGCTGACCG AATACAACAA CAAGGCAGCC CTGGAACCGG 1020

TAAACCCGCC GAAGCCTCCC CCCGCCATCA AGATCGACCC GCCCCCGCCT CCGCAAGAGC 1080

AGGGATTGAT CCCTGGCTTC CTGATGCCGC CGTCTGACGG CTCCGGTGTG ACTCCCGGTA 1140

CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 1200

CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 1260

TGGCGGTCAA AGCGGCATCG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 1320

TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 1380

GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC 1440

CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 1500

ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 1560

GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 1620

GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 1680

CGGATCCTTC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 1740

GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 1800

GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 1860

CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 1920

AATGGCCTAA GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 1980

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 2040

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100

GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 2138



(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asp Gln Gln Arg
180 185 190

Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met
305 310 315 320

Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala
405 410 415

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460

(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 277 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala
50 55 60

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro
65 70 75 80

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gln Gly Cys Leu Val
115 120 125

Val Val Phe Gly Gln His Leu Leu Arg Pro Leu Leu Ile Leu Arg Val
130 135 140

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro
145 150 155 160

Phe Glu Pro Asp Tyr Val Phe Ile Ser Arg Met Phe Pro Pro Ser Pro
165 170 175

His Val Gln Leu Arg Asp Ile Leu Ser Leu Leu Gly His Arg Ser Ala
180 185 190

Gln Phe Gly His Val Glu Tyr Pro Leu Pro Leu Leu Ile Glu Arg Ser
195 200 205

Leu Ala Ser Gly Ser Arg Ile Ala Phe Pro Val Val Lys Pro Pro Glu
210 215 220

Pro Leu Asp Val Ala Leu Gln Arg Gln Val Glu Ser Val Pro Pro Ile
225 230 235 240

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro
245 250 255

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275

(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro
1 5 10 15

Leu Asp Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Trp Thr Ser Met Pro Ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr
50 55 60

Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val Ile Pro Ser Arg Arg
65 70 75 80

Ser Val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser His Arg
85 90 95

Ala Ser Pro Ser Asn Ala Cys Pro Val Arg Ile Val Thr Ser Ala Ser
100 105 110

Gly Arg Pro Ile Ser Ser Pro Pro Ile Val Arg Ser Arg Ser Cys Val
115 120 125

Asp Lys Asn Gly Arg Arg Cys Ala Ser Gly Tyr Arg Arg Leu Asn Arg
130 135 140

Ala Arg Ser Ser Ser Ile Ala Ala Arg Cys Arg Thr Ile Gly Thr Phe
145 150 155 160

Arg Arg Ser Arg Tyr Ser Ala Ser Met Arg Val Ser Thr Asn Ser Pro
165 170 175

His Val Thr His Gly Val Ala Pro Gly Val Thr Arg Arg Ile Gly Gly
180 185 190



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 196 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

Ala Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro
35 40 45

Arg Asn Pro Arg Arg Ser Ser Arg Arg Asp Ala Glu Asp Arg Arg Val
50 55 60

Ile Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala
65 70 75 80

Gly Gly Glu Ala Asp Gln Leu Ile Asp Leu Gly Val Cys Arg Arg Gln
85 90 95

Ala Gly Arg Val Arg Arg Gly Gln Glu Leu His His Arg His Arg His
100 105 110

Gln Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val
115 120 125

Gln Gln His Arg Arg Leu Gln Arg Val Arg Gln Leu Arg Arg Tyr Val
130 135 140

Gln Thr Ala His His Arg Arg Phe Ala Arg Thr Asp Arg Val Arg His
145 150 155 160

His Val Arg Gly Pro Ser Asn His Arg Arg Arg Arg Val Tyr Arg Gly
165 170 175

Arg His Ser Gly Ala Gly Gly Cys Pro Ala Gly Gly Ala Gly Ser Val
180 185 190

Gly Gly Ser Ala
195

(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr
1 5 10 15

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys
20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly
50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro Ile Val Arg Gln Arg Ala
85 90 95

Ala Ile Glu Glu Leu Arg Ala Arg Phe Asn Leu Arg Tyr Pro Leu Ala
100 105 110

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Gly Glu Glu Ile Gly Leu Pro Asp Ala Glu Val Thr Ile Arg Thr Gly
130 135 140

Gln Ala Leu Leu Gly Asp Ala Arg Trp Leu Ala Ser Leu Val Pro Asn
145 150 155 160

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly Ile Thr Asp Val Ala
165 170 175

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val
180 185 190

Pro Asp Gly Ile Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp
195 200 205

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu
210 215 220

Leu Thr Asn Asp Gly Ser Asn Gly Glu Ser Gly Glu Ser Ser Gln Ser
225 230 235 240

Ile Asn Asp Ala Ala Thr Arg Tyr Met Thr Asp Glu Tyr Arg Gln Phe
245 250 255

Pro Thr Arg Asn Gly Ala Gln Arg Ala Leu His Arg Val Val Thr Leu
260 265 270

Leu Ala Ala Gly Arg Pro Val Leu Thr His Cys Phe Ala Gly Lys Asp
275 280 285

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 



CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 60

CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 120

CGCAGGCGTT GAACGCCTTG AATGAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 180

TGGCGCTATT CGTCAACGCG CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 240

ACCTTGCGGA GTTCACCGAC AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 300

AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 360

TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 420

TGGATGGGTT GGAGACCGTC CTGCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 480

TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 540

CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 600

CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 660

CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 720

AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 780

CGGATACGCC GTTGTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 840

AAGGGGTTCA GGTGGGACCG ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 900

t r a t n n r; t n TCCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 960

CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 1020

CGATACCACC GCCGCCTCCT GGGCCCGACG TAATCCCGGG TCCGGTGCCA CCGGTCTTGG 1080

CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAACTTCGAC TACATGGGCC 1140

TCTTGTTGCT GTCGCCGGGC CTGGCGACCT TCCTGTTCGG GGTGTCATCT AGCCCCGCCC 1200

GTGGAACGAT GGCCGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATCG 1260

CGGCATTCGT CGCACATTCG TGGTACCGCA CAGAACATCC GCTCATAGAC ATGCGCTTGT 1320

TCCAGAACCG AGCGGTCGCG CAGGCCAACA TGACGATGAC GGTGCTCTCC CTCGGGCTGT 1380

TTGGCTCCTT CTTGCTGCTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC 1440

AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 1500

GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 1560

CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 1620

TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATGGGCTG CTCCATGATG CCACTGTCCG 1680

GGGCGGCAGT GCAGACCCTG GCCCCACATC AGATCGCTCG CGGTTCGACG CTGATCAGCG 1740

TCAACCAGCA GGTGGGCGGT TCGATAGGGA CCGCACTGAT GTCGGTGCTG CTCACCTACC 1800

AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860

GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTACC GCGCCAAACC AACTTCGCGG 1920

CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980

TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040

GAGCACCGTT GCTATCCGCA TGACGTCTGC TT 2072



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1923 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 



TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60

TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120

TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180

CGGCTCAGGC GTTGCGCGCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240

GGTGCGTTTC ACCCACATCG CAAGCTCGAG ACGCCCGTCG TCCTCTTGCA CGCTCAGCCA 300

GGTTGGCGTG TCGCCGCCTT CCAGCAAGTG TTCCCACCAC ACGAAGGGAC CCTCGCGAAA 360

GGTGACTGAT CCGCGGACCA CATAGTCGAT GCCACCGTGG CTGACAATTG CGCCGGGTCC 420

GAGTTGGCGG GGGCCGAATT GCGGCATTGC GTCGAAGGCC AGCGGATCCC GGCGCCCGCC 480

CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540

GAACAGCGCC ACGGCAATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600

CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660

GGACCGGGGG TCACAACGGG CGAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720

TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780

AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840

GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC 900

CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 960

GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC 1020 

GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 1080 

TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 1140

GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 1200 

CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 1260

TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 1320 

TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 1380 

AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCCTG CGACATCGAT GGTTGTTCGC 1440

ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG 1500 

ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 1560 

ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 1620 

CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGTCGGATGC GGCCTTGGCG 1680 

CAGCGCCTGT TGGCCATCAG CGATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 1740

GTGGTGGCTA TCGAACGGGT GTTCTCTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 1800 

CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 1860 

ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC 1920 
ACC 1923

(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1055 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 60 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 120 

GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 180 

AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 240

GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG 300 

GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 360 

AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 420

GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 480

CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 540

CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 600 

CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 660 

GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 720 

AGCAGCGTGA TTCCGCATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 780

TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 840 

CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA GACAGGCAAC ATGGCAACAC 900 

GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAG GTGCACGCCC 960 

AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TCGGGNGCGG 1020

GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGAC 1055 
(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 



CCGCCTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 60

TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 120

CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 180

CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 240

GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 300

TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 359



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 60 

GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 120 

CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 180 

TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 240

TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 300 

GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 350 
(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 679 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp
1 5 10 15

Gly Leu Ala Gly Lys Gly Lys Gln Ile Asn Thr Thr Leu Asn Ser Leu
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp
50 55 60

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val
100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro
130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe
165 170 175

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile Lys
195 200 205

Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala Ser Thr
210 215 220

Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro Pro Asn
225 230 235 240

Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 255

Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro Gly Met
260 265 270

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr Pro Glu
275 280 285

Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
290 295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
305 310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335

Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val Leu
340 345 350

Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365

Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His
385 390 395 400

Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala Phe Val
405 410 415

Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460

Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510

Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala Asp
515 520 525

Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met
530 535 540

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gln Thr Leu Ala
545 550 555 560

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln
565 570 575

Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr Tyr
580 585 590

Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650 655

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala
675

(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser
1 5 10 15

Met Val Glu Ile Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
20 25 30

Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala
35 40 45

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu
50 55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg
65 70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala
85 90 95

Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr
100 105 110

Thr Arg Arg Asp Pro Arg Glu Arg
115 120

(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg
1 5 10 15

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser
20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80

Gly Asp Gly Ser Asp Val Thr Val Gly
85

(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala
1 5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp
20 25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly
35 40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln
50 55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly
65 70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro
85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys
100 105 110

Pro Asp Ala Gly Ile Gly Gln
115

(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu
1 5 10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu
50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu
65 70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val
85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 105 110

Glu Asp Phe Ser
115

(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TGCTACGCAG CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGCTGC TGGCGATGGC 60

GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCGAACTGCT CAAAAGGCTG 120

GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 180

GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 240

TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 300

ACGCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 360

TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 420

ACCAGGCCGC CATGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 480

ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 540

TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 600

GACCGGTCGA CCCGCTGCCG CGGCGGCGAG CCGACCTGGA CAATGGGGTC TTCGAGCACC 660

GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGTTGC 720

GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 780

GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G 811



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 60 

GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 120 

GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 180 

ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 240

GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 300

TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 360

CCAAGGACAA CGGACGCATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 420

ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 480

ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 540

TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 600

CGAAGTCGTC GAAATCGTAT GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 660

CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 720

TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 780

GCGGCGCTAT AGAAGCCGCT CTGCGCGATT ATCAAACGCA AAATACGCTT ACTCATGCCA 840

TCGGCGCTGC TCACCCGATG CGACGTTTTT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 900

AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 960

CATCCT 966

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2367 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:



CCGCACCGCC GGCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT 60

TGCCGCCCGT CCCGCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 120

CACCGGAGCC GCCGTTGCCG CCGTCACCGC CAGCCCCGCC GACTCCACCG GCCCCACCGA 180

CTCCGCCGCT GCCACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 240

CACCCACGCC ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 300

CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAACGTCG 360

CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 420

CGCCGGCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 480

CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 540

TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 600

CAAACCTCGC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 660

CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 720

CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 780

CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 840

CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 900

ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 960

CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 1020

CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 1080

GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 1140

CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 1200

CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 1260

CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 1320

CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 1380

CGCCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 1440

TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGCCGTTGCC ACCGTTGCCA AACAGCAACC 1500

CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GCCGTTTCCG ATCAACGGGC 1560

GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 1620

CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680

ACGCTGCCAC CTGTACGCTG AGCGCCTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 1740

CAATCGCCGC CGACACTTCA TCGGCAGCCG CAGCCACCAC TTCCGTCGTC GGGATCGCCG 1800

CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 1860

CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 1920

CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 1980

GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 2040

GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 2100

AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 2160

CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 2220

GGCCTCGTTG CGCGTACGCG ATCGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 2280

CATCGTCGAA CTCGCTGTCA CCCATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 2340

CCGCCTGGTA GATCCCGAAG CGGhCCG 2367



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
1 5 10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
20 25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
50 55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
65 70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140

His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
145 150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
210 215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
260 265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
305 310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg
370 375

(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 



GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 60

GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 120

CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 180

GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 240

GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 300

ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 360

ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 420

AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 480

CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 540

CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 600

ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 660

GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 720

GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 780

GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 840

GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 900

CTGATTTTCG GGAGCCCCGG CTTCTTCAAC TCGACCAGTG CGCCGTCGTC GGGATTCTTC 960

AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 1020

TTCAACTCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 1080

TCGGGCGTGA TCAACTCGGG CAACACCGTA TCGGGTTTGT TCAACATGAG CCTGGTGGCC 1140

ATCACAACGC CGGCCTTGAT CTCGGGCTTC TTCAACACCG GAAGCAACAT GTCGGGATTT 1200

TTCGGTGGCC CACCGGTCTT CAATCTCGGC CTGGCAAACC GGGGCGTCGT GAACATTCTC 1260

GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCGG TGACTTCAAC 1320

ATCCTTGGCA GCGGCAACCT CGGCAGCCAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 1380

TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560

AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 1620

AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCACCGGCAA CAGCGGCCTG 1680

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACCGGCTTC 1980

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTC GGGTTTCCAG 2520

AACCTCGGCT CGCTGCAGTC AGGCTGGGCG AACCTGGGCA ACTCCGTATC GGGCTTTTTC 2580

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCGGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700

AACCTGGGCC AGTTGAACAT CGGCAGCGCC TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760

GTTTCAACAA TCATATCCGC GTTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820

GTAAGCGAAT AAACCGAATG GCGGCCTGTC AT 2852



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
50 55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
65 70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
115 120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
130 135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
145 150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
210 215 220

Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
225 230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe
305 310 315 320

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn
325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn
355 360 365

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala Ile Thr Thr Pro
370 375 380

Ala Leu Ile Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe
385 390 395 400

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val
405 410 415

Val Asn Ile Leu Gly Asn Ala Asn Ile Gly Asn Tyr Asn Ile Leu Gly
420 425 430

Ser Gly Asn Val Gly Asp Phe Asn Ile Leu Gly Ser Gly Asn Leu Gly
435 440 445

Ser Gln Asn Ile Leu Gly Ser Gly Asn Val Gly Ser Phe Asn Ile Gly
450 455 460

Ser Gly Asn Ile Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn
465 470 475 480

Tyr Asn Ile Gly Ser Gly Asn Leu Gly Ile Tyr Asn Ile Gly Phe Gly
485 490 495

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn
500 505 510

Gln Gly Phe Ala Asn Thr Gly Asn Asn Asn Ile Gly Phe Ala Asn Thr
515 520 525

Gly Asn Asn Asn Ile Gly Ile Gly Leu Ser Gly Asp Asn Gln Gln Gly
530 535 540

Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu
545 550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly
565 570 575

Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn
580 585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605

Gly Ile Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly
610 615 620

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800

Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830

Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895

Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
930 935 940

(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 
(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 
(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31 
(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33 

(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:210: 
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 38 
(2) INFORMATION FOR SEQ ID NO:211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 
CCGCATGCGA GCCACGTGCC CACAACGGCC 30 
(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 
(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7676 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 
TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60 

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720

GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960

AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140

TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200

CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260

TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640






[sequence illegible in source] 2700


GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CGAACGTCAC 5460

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420

CCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

CGGACAGCCG CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

[10 bases missing in source] AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala
420 

Thr Leu Ala Gin 
435 

Lys Thr Gin lie 
450 

Gin Trp Arg Gly 
465 

Phe Gin Glu Ala 



Thr Asn lie Arg 
500 

Gin Gin Gin Ala 
515 

Ala Ser Pro Pro 
530 

Val Ala Pro Pro 
545 

Gly Asp Pro Asn 



Pro Pro Val lie 
580 



Pro Val Gly Gly 
595 

Asp Ala Ala His 
610 

Gly Asp Pro Pro 
625 

Arg lie Val Leu 



Ala Thr Asp Ser 
660 



Phe Tyr Met Pro 
675 

Leu Asp Ala Asn 
690 

Phe Ser Asp Pro 
705 

Gly Ser Pro Ala 



Phe Val Val Trp 



Thr lie Ser Ser 



Glu Ala Gly Asn 
440 



Asp Gin Val Glu 
455 

Ala Ala Gly Thr 
470 

Ala Asn Lys Gin 
485 

Gin Ala Gly Val 



Leu Ser Ser Gin 
520 



Ser Thr Ala Ala 
535 

Pro Pro Ala Ala 
550 

Ala Ala Pro Pro 
565 

Ala Pro Asn Ala 



Phe Ser Phe Ala 
600 



Phe Asp Tyr Gly 
615 

Phe Pro Gly Gin 
630 

Gly Arg Leu Asp 
645 

Lys Ala Ala Ala 



Tyr Pro Gly Thr 
680 



Gly Val Ser Gly 
695 

Ser Lys Pro Asn 
710 

Ala Asn Ala Pro 
725 

Leu Gly Thr Ala 



Ala Glu Met Lys 
425 

Phe Glu Arg lie 



Ser Thr Ala Gly 
460 



Ala Ala Gin Ala 
475 

Lys Gin Glu Leu 
490 

Gin Tyr Ser Arg 
505 

Met Gly Phe Val 



Ala Pro Pro Ala 
540 



Ala Asn Thr Pro 
555 

Pro Ala Asp Pro 
570 

Pro Gin Pro Val 
585 

Leu Pro Ala Gly 



Ser Ala Leu Leu 
620 



Pro Pro Pro Val 
635 

Gin Lys Leu Tyr 
650 

Arg Leu Gly Ser 
665 

Arg lie Asn Gin 



Ser Ala Ser Tyr 
700 



Gly Gin lie Trp 
715 



Asp Ala Gly Pro 
730 

Asn Asn Pro Val 



Thr Asp Ala Ala 
430 

Ser Gly Asp Leu 
44 5 

Ser Leu Gin Gly 



Ala Val Val Arg 
480 



Asp Glu lie Ser 
4 95 



Ala Asp Glu Glu 
510 

Pro Thr Thr Ala 
525 

Pro Ala Thr Pro 



Asn Ala Gin Pro 
560 



Asn Ala Pro Pro 
575 

Arg lie Asp Asn 
590 

Trp Val Glu Ser 
605 

Ser Lys Thr Thr 



Ala Asn Asp Thr 
640 



Ala Ser Ala Glu 
655 

Asp Met Gly Glu 
670 

Glu Thr Val Ser 
685 

Tyr Glu Val Lys 



Thr Gly Val lie 
720 



Pro Gin Arg Trp 
735 

Asp Lys Gly Ala 
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740 



Ala Lys Ala Leu Ala 
755 

Ala Pro Ala Pro Ala 
770 

Gly Glu Val Ala Pro 
785 

Pro Ala 



745 

Glu Ser lie Arg Pro 
760 

Pro Ala Glu Pro Ala 
775 

Thr Pro Thr Thr Pro 
790 



750 

Leu Val Ala Pro Pro Pro 
765 

Pro Ala Pro Ala Pro Ala 
780 



Thr Pro Gin Arg Thr Leu 
795 800 
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CLAIMS 



1. A polypeptide comprising an immunogenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser;
(SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro;
(SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
(SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro;
(SEQ ID No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly;
(SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn;
(SEQ ID No. 128) and

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
(SEQ ID No. 136)

wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129) and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may be any
amino acid.

3. A polypeptide comprising an immunogenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the complements of said sequences, and 
DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 
99 and 101 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID Nos.: 26-51, 138, 139, 163-183 and 201, the complements of said sequences, and
DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 26-51, 138, 139,
163-183 and 201 or a complement thereof under moderately stringent conditions.

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4. 

6. An expression vector comprising a DNA molecule according to claim 5.

7. A host cell transformed with an expression vector according to claim 6.

8. The host cell of claim 7 wherein the host cell is selected from the group
consisting of E. coli, yeast and mammalian cells.

9. A pharmaceutical composition comprising one or more polypeptides 
according to any one of claims 1-4 and a physiologically acceptable carrier. 

10. A pharmaceutical composition comprising one or more DNA 
molecules according to claim 5 and a physiologically acceptable carrier. 

11. A pharmaceutical composition comprising one or more DNA
sequences recited in SEQ ID Nos.: 3, 11, 12, 140 and 141; and a physiologically acceptable
carrier.

12. A vaccine comprising one or more polypeptides according to any one 
of claims 1-4 and a non-specific immune response enhancer. 

13. A vaccine comprising: 

a polypeptide having an N-terminal sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 134 and 135; and 
a non-specific immune response enhancer. 

14. A vaccine comprising:

one or more polypeptides encoded by a DNA sequence selected from the
group consisting of SEQ ID Nos.: 3, 11, 12, 140 and 141, the complements of said
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 3, 11,
12, 140 and 141; and

a non-specific immune response enhancer.

15. The vaccine of claims 12-14 wherein the non-specific immune
response enhancer is an adjuvant.

16. A vaccine comprising one or more DNA molecules according to claim
5 and a non-specific immune response enhancer.

17. A vaccine comprising one or more DNA sequences recited in SEQ ID
Nos.: 3, 11, 12, 140 and 141; and a non-specific immune response enhancer.

18. The vaccine of claims 16 or 17 wherein the non-specific immune 
response enhancer is an adjuvant. 

19. A pharmaceutical composition according to any one of claims 9-11, for 
use in the manufacture of a medicament for inducing protective immunity in a patient. 

20. A vaccine according to any one of claims 12-18, for use in the 
manufacture of a medicament for inducing protective immunity in a patient. 

21. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

22. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6. 

23. A fusion protein comprising one or more polypeptides according to
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 155).

24. A pharmaceutical composition comprising a fusion protein according 
to any one of claims 21-23 and a physiologically acceptable carrier. 

25. A vaccine comprising a fusion protein according to any one of claims 
21-23 and a non-specific immune response enhancer. 

26. The vaccine of claim 25 wherein the non-specific immune response
enhancer is an adjuvant.

27. A pharmaceutical composition according to claim 24, for use in the
manufacture of a medicament for inducing protective immunity in a patient.

28. A vaccine according to claims 25 or 26, for use in the manufacture of a
medicament for inducing protective immunity in a patient.

29. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with one or more polypeptides 
according to any one of claims 1-4; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

30. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences recited in SEQ ID NO: 
134 and 135; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

31. A method for detecting tuberculosis in a patient, comprising:

(a) contacting dermal cells of a patient with one or more polypeptides
encoded by a DNA sequence selected from the group consisting of SEQ ID Nos.: 3, 11, 12,
140, 141, 156-160, 189-193, 199, 200 and 203, the complements of said sequences, and DNA
sequences that hybridize to a sequence recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160,
189-193, 199, 200 and 203; and

(b) detecting an immune response on the patient's skin and therefrom
detecting tuberculosis in the patient.

32. The method of any one of claims 29-31 wherein the immune response
is induration.

33. A diagnostic kit comprising:

(a) a polypeptide according to any one of claims 1-4; and

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

34. A diagnostic kit comprising:

(a) a polypeptide having an N-terminal sequence selected from the group
consisting of sequences recited in SEQ ID NO: 134 and 135; and

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

35. A diagnostic kit comprising:

(a) a polypeptide encoded by a DNA sequence selected from the group
consisting of SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203, the
complements of said sequences, and DNA sequences that hybridize to a sequence recited in
SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200 and 203; and

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.



36. A diagnostic kit comprising: 

(a) a fusion protein according to any one of claims 21-23; and 

(b) apparatus sufficient to contact said fusion protein with the dermal cells of a 
patient. 

37. A fusion protein according to claim 23 comprising an amino acid
sequence selected from the group consisting of sequences recited in SEQ ID NO: 153 and
209.

an extent that no meaningful International Search can be carried out, specifically:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).


Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 


This International Searching Authority found multiple inventions in this international application, as follows:

searchable claims.

covers only those claims for which fees were paid, specifically claims Nos.:

restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

See annex, subject 1.

1. Claims: 1, 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

A polypeptide comprising an immunogenic portion of a soluble
M. tuberculosis antigen or a variant, having an N-terminal
amino acid sequence as in Seq.ID:120. A DNA molecule encoding
said polypeptide as in Seq.ID:101. An expression vector
comprising said DNA molecule, a host transformed with said
expression vector. A pharmaceutical composition or vaccine
comprising said polypeptide or said DNA molecule. Fusion
protein comprising said polypeptide and pharmaceutical
composition or vaccine thereof. A diagnostic kit comprising
said polypeptide or fusion protein.



2. Claims: 1, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:121. 



3. Claims: 1, 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:122 and 25. 



4. Claims: 1, 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:123 and 24. 



5. Claims: 1, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:124. 



6. Claims: 1, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:125. 



7. Claims: 1, 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:126 and 52.

8. Claims: 1, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:127.

9. Claims: 1, 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:128 and 99.

10. Claims: 1, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:136.

11. Claims: 2, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:129.

12. Claims: 2, 5-10, 12, 15, 16, 18-29, 31-33,
35-37 all partially.

Same as invention 1 but for Seq.ID:137 and 203.

13. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:1.

14. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:2.

15. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:4 and 17.

16. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:5. 

17. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:6. 

18. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:7. 

19. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:8. 

20. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:9.

21. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:10 and 13.

22. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:14.

23. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:15 and 153.

24. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:16. 

25. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36, 

37 all partially. 

Same as invention 1 but for Seq.ID:18. 

26. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:19.

27. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:20.

28. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:21.

29. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:22.

30. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:23.

31. Claims: 3, 5-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:99.

32. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:26.

33. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:27.



34. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:28. 



35. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:29. 



36. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:30. 



37. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:31. 



38. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:32. 



39. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:33.

40. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:34.

41. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:35.

42. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:36.

43. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:37.

44. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:38.

45. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:39.

46. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:40.

47. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:41.

48. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:42.

49. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:43, 44 and 183. 



50. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:45. 



51. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:46 and 153. 



52. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:47. 



53. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially. 



Same as invention 1 but for Seq.ID:48. 



54. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 



Same as invention 1 but for Seq.ID:49. 

55. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:50.

56. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:51.

57. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:138.

58. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:139.

59. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:163.

60. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:165.

61. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:166.

62. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:167.

63. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

Same as invention 1 but for Seq.ID:168.

64. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:169 and 170. 

65. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:171 and 172. 

66. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:173 and 174. 

67. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:175 and 176. 

68. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:177 and 178. 

69. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially. 

Same as invention 1 but for Seq.ID:179 and 180. 

70. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36, 
37 all partially. 

Same as invention 1 but for Seq.ID:181 and 182. 

71. Claims: 4-10, 12, 15, 16, 18-29, 32, 33, 36,
37 all partially.

International Application No. PCT/US 97/18293

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

Same as invention 1 but for Seq.ID:201.

72. Claims: 11, 14, 15, 17-20, 31, 32, 35 all partially.

Pharmaceutical composition or vaccine comprising a DNA
sequence as in Seq.ID:3. Vaccine comprising a polypeptide
encoded by said Seq.ID:3, complement or hybridizing
sequences thereof. Use of said polypeptide in diagnostic.

73. Claims: 11, 14, 15, 17-20, 31, 32, 35 all partially.

Same as invention 72 but for Seq.ID:11.

74. Claims: 11, 14, 15, 17-20, 31, 32, 35 all partially.

Same as invention 72 but for Seq.ID:12.

75. Claims: 11, 14, 15, 17-20, 31, 32, 35 all partially.

Same as invention 72 but for Seq.ID:140.

76. Claims: 11, 14, 15, 17-20, 31, 32, 35 all partially.

Same as invention 72 but for Seq.ID:141.

77. Claims: 13, 15, 20, 30, 32, 34 all partially.

Vaccine comprising a polypeptide having an N-terminal
sequence as in Seq.ID:134. Use of said polypeptide in
diagnostic.

78. Claims: 13, 15, 20, 30, 32, 34 all partially.

Same as invention 77 but for Seq.ID:135.

79. Claims: 31, 32, 35 all partially.

Use in diagnostic of a polypeptide encoded by a DNA sequence
as in Seq.ID:156.

80. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:157.

81. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:158.

82. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:159 and 160.

83. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:189.

84. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:190.

85. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:191.

86. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:192.

87. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:193.

88. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:199 and 200.

89. Claims: 31, 32, 35 all partially.

Same as invention 79 but for Seq.ID:203.

Polypeptides comprising an immunogenic portion of a soluble M.
tuberculosis antigen are well documented in the prior art.